p124
This problem involves simple linear regression without an intercept.
Recall that the coefficient estimate \(\hat{\beta}\) for the linear regression of Y onto X without an intercept is given by (3.38). Under what circumstance is the coefficient estimate for the regression of X onto Y the same as the coefficient estimate for the regression of Y onto X?
Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is different from the coefficient estimate for the regression of Y onto X.
Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.
library(ISLR)
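The coefficient estimate for the regression of X onto Y equals the coefficient estimate for the regression of Y onto X when the sum of squares of the x values equals the sum of squares of the y values, that is, when \(\sum_i x_i^2 = \sum_i y_i^2\). Both estimates share the numerator \(\sum_i x_i y_i\) and differ only in the denominator, so they also coincide trivially when \(\sum_i x_i y_i = 0\) (both are then zero). A perfect relationship y = x with no irreducible error is one special case of this condition; y = -x, or y set to any permutation of x, satisfies it as well.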
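More precisely, the condition can be read off from (3.38) by writing out both estimates side by side. The labels \(\hat{\beta}_{Y \mid X}\) and \(\hat{\beta}_{X \mid Y}\) are introduced here only for illustration and are not the book's notation:

```latex
\begin{align*}
\hat{\beta}_{Y \mid X} &= \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}, &
\hat{\beta}_{X \mid Y} &= \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} y_i^2}, \\
\hat{\beta}_{Y \mid X} = \hat{\beta}_{X \mid Y}
&\iff \left( \sum_{i=1}^{n} x_i y_i \right)
      \left( \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} x_i^2 \right) = 0 .
\end{align*}
```

So the estimates agree exactly when \(\sum_i x_i^2 = \sum_i y_i^2\) or when \(\sum_i x_i y_i = 0\).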
set.seed(0)
x = rnorm(100)
Example with different estimates: regression of Y onto X, where y = 2x + noise
y = 2 * x + rnorm(100)
lm.fit.no.intercept = lm(y ~ 0 + x)
summary(lm.fit.no.intercept)
##
## Call:
## lm(formula = y ~ 0 + x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6391 -0.8650 -0.2032 0.5898 2.7879
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## x 2.1374 0.1092 19.58 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9589 on 99 degrees of freedom
## Multiple R-squared: 0.7948, Adjusted R-squared: 0.7927
## F-statistic: 383.4 on 1 and 99 DF, p-value: < 2.2e-16
lm.fit.x.y = lm(x ~ 0 + y)
summary(lm.fit.x.y)
##
## Call:
## lm(formula = x ~ 0 + y)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.22971 -0.24830 0.04216 0.34170 0.71230
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y 0.37185 0.01899 19.58 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4 on 99 degrees of freedom
## Multiple R-squared: 0.7948, Adjusted R-squared: 0.7927
## F-statistic: 383.4 on 1 and 99 DF, p-value: < 2.2e-16
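As a sanity check on the two fits above, the estimates can be recomputed directly from the closed form in (3.38). This is a minimal sketch that regenerates the same data with the same seed; the names `beta.y.onto.x`, `beta.x.onto.y`, `fit.y.x` and `fit.x.y` are made up for this check and do not appear elsewhere in the solution.

```r
# Recreate the data from the example above (same seed, same order of draws).
set.seed(0)
x = rnorm(100)
y = 2 * x + rnorm(100)

# Closed-form no-intercept estimates, following (3.38).
beta.y.onto.x = sum(x * y) / sum(x^2)
beta.x.onto.y = sum(x * y) / sum(y^2)

# Compare with lm(); unname() drops the coefficient labels "x" and "y".
fit.y.x = lm(y ~ 0 + x)
fit.x.y = lm(x ~ 0 + y)
print(all.equal(beta.y.onto.x, unname(coef(fit.y.x))))
print(all.equal(beta.x.onto.y, unname(coef(fit.x.y))))

# The denominators differ, which is why the two estimates differ.
print(c(sum.x.sq = sum(x^2), sum.y.sq = sum(y^2)))
```

Note that the two estimates are not reciprocals of each other. Their product equals the (uncentered) R-squared that `summary()` reports for a no-intercept fit, which is consistent with the output above: 2.1374 × 0.37185 ≈ 0.7948.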
Example with identical estimates: y = x
y = x
lm.fit.y.x = lm(y ~ 0 + x)
summary(lm.fit.y.x)
## Warning in summary.lm(lm.fit.y.x): essentially perfect fit: summary may be
## unreliable
##
## Call:
## lm(formula = y ~ 0 + x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.121e-16 -3.665e-17 -8.400e-19 4.368e-17 2.976e-16
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## x 1.000e+00 1.058e-17 9.449e+16 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.297e-17 on 99 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 8.928e+33 on 1 and 99 DF, p-value: < 2.2e-16
lm.fit.x.y = lm(x ~ 0 + y)
summary(lm.fit.x.y)
## Warning in summary.lm(lm.fit.x.y): essentially perfect fit: summary may be
## unreliable
##
## Call:
## lm(formula = x ~ 0 + y)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.121e-16 -3.665e-17 -8.400e-19 4.368e-17 2.976e-16
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y 1.000e+00 1.058e-17 9.449e+16 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.297e-17 on 99 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 8.928e+33 on 1 and 99 DF, p-value: < 2.2e-16
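Setting y = x is the simplest way to make the estimates match, but it is not the only one. The estimates agree whenever the two sums of squares are equal, so a less degenerate example can be sketched by shuffling x. This is an illustrative alternative, not part of the original solution, and the fit names `fit.y.x` and `fit.x.y` are chosen here for the sketch.

```r
set.seed(0)
x = rnorm(100)

# Shuffle x: y holds the same values in a different order,
# so sum(y^2) equals sum(x^2) even though y is not equal to x.
y = sample(x)

fit.y.x = lm(y ~ 0 + x)
fit.x.y = lm(x ~ 0 + y)

# The sums of squares match up to floating-point rounding ...
print(all.equal(sum(x^2), sum(y^2)))

# ... and so do the two coefficient estimates.
print(all.equal(unname(coef(fit.y.x)), unname(coef(fit.x.y))))
```

Unlike the y = x case, these fits are far from perfect, so `summary()` runs without the "essentially perfect fit" warning while still returning the same coefficient in both directions.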