
Question

ISLR, p. 124 (Exercise 12)

This problem involves simple linear regression without an intercept.

  1. Recall that the coefficient estimate \(\hat{\beta}\) for the linear regression of Y onto X without an intercept is given by (3.38). Under what circumstance is the coefficient estimate for the regression of X onto Y the same as the coefficient estimate for the regression of Y onto X?

  2. Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is different from the coefficient estimate for the regression of Y onto X.

  3. Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.


library(ISLR)

12a

The coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X exactly when the sum of the squared observations of X equals the sum of the squared observations of Y. One simple way to arrange this is a perfect linear relationship with slope 1 and no irreducible error (y = x), which is the example used in 12c below.
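
This follows directly from (3.38): writing out both estimates makes the condition explicit.

\[
\hat{\beta}_{Y \sim X} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{j=1}^{n} x_j^2},
\qquad
\hat{\beta}_{X \sim Y} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{j=1}^{n} y_j^2}
\]

The numerators are identical, so the two estimates are equal if and only if \(\sum_{j=1}^{n} x_j^2 = \sum_{j=1}^{n} y_j^2\).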

Create arbitrary data for the predictor X:

set.seed(0)
x = rnorm(100)

12b

Generate Y from X with added noise:

y = 2 * x + rnorm(100)

Fit y onto x without intercept: y ~ 0 + x

lm.fit.no.intercept = lm(y ~ 0 + x)
summary(lm.fit.no.intercept)
## 
## Call:
## lm(formula = y ~ 0 + x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.6391 -0.8650 -0.2032  0.5898  2.7879 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## x   2.1374     0.1092   19.58   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9589 on 99 degrees of freedom
## Multiple R-squared:  0.7948, Adjusted R-squared:  0.7927 
## F-statistic: 383.4 on 1 and 99 DF,  p-value: < 2.2e-16

Fit x onto y without intercept: x ~ 0 + y

lm.fit.x.y = lm(x ~ 0 + y)
summary(lm.fit.x.y)
## 
## Call:
## lm(formula = x ~ 0 + y)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.22971 -0.24830  0.04216  0.34170  0.71230 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## y  0.37185    0.01899   19.58   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4 on 99 degrees of freedom
## Multiple R-squared:  0.7948, Adjusted R-squared:  0.7927 
## F-statistic: 383.4 on 1 and 99 DF,  p-value: < 2.2e-16
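
As a quick sanity check (a small addition, reusing the x and y defined above), the two slopes can also be computed directly from (3.38); they should reproduce the estimates printed in the two summaries, and they differ because sum(x^2) and sum(y^2) are not equal here.

# Slopes computed directly from (3.38), without lm()
beta.y.on.x = sum(x * y) / sum(x^2)  # regression of Y onto X
beta.x.on.y = sum(x * y) / sum(y^2)  # regression of X onto Y
c(beta.y.on.x, beta.x.on.y)          # should match 2.1374 and 0.37185 above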

12c

Now generate Y as an exact copy of X (no noise):

y = x

Fit y onto x without intercept: y ~ 0 + x

lm.fit.y.x = lm(y ~ 0 + x)
summary(lm.fit.y.x)
## Warning in summary.lm(lm.fit.y.x): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = y ~ 0 + x)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -6.121e-16 -3.665e-17 -8.400e-19  4.368e-17  2.976e-16 
## 
## Coefficients:
##    Estimate Std. Error   t value Pr(>|t|)    
## x 1.000e+00  1.058e-17 9.449e+16   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.297e-17 on 99 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 8.928e+33 on 1 and 99 DF,  p-value: < 2.2e-16

Fit x onto y without intercept: x ~ 0 + y

lm.fit.x.y = lm(x ~ 0 + y)
summary(lm.fit.x.y)
## Warning in summary.lm(lm.fit.x.y): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = x ~ 0 + y)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -6.121e-16 -3.665e-17 -8.400e-19  4.368e-17  2.976e-16 
## 
## Coefficients:
##    Estimate Std. Error   t value Pr(>|t|)    
## y 1.000e+00  1.058e-17 9.449e+16   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.297e-17 on 99 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 8.928e+33 on 1 and 99 DF,  p-value: < 2.2e-16
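
As a final check (a small addition, reusing the fits from 12c), the condition from 12a holds here: with y = x the two sums of squares are identical, so the two coefficient estimates agree.

# With y = x the condition from 12a is satisfied exactly
sum(x^2) == sum(y^2)
# ... and the two fitted slopes agree (both are essentially 1)
all.equal(unname(coef(lm.fit.y.x)), unname(coef(lm.fit.x.y)))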