p299
Use the poly() function to fit a cubic polynomial regression to predict nox using dis. Report the regression output, and plot the resulting data and polynomial fits.
Plot the polynomial fits for a range of different polynomial degrees (say, from 1 to 10), and report the associated residual sum of squares.
Perform cross-validation or another approach to select the optimal degree for the polynomial, and explain your results.
Use the bs() function to fit a regression spline to predict nox using dis. Report the output for the fit using four degrees of freedom. How did you choose the knots? Plot the resulting fit.
Now fit a regression spline for a range of degrees of freedom, and plot the resulting fits and report the resulting RSS. Describe the results obtained.
Perform cross-validation or another approach in order to select the best degrees of freedom for a regression spline on this data. Describe your results.
library(MASS)
library(tidyverse)
library(gridExtra)
# The exercise predicts nox from dis, so dis goes on the x-axis and nox on the y-axis.
# Panel 1: straight-line fit
g1 <- ggplot(Boston, aes(x = dis, y = nox)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x)
# Panel 2: LOESS smooth
g2 <- ggplot(Boston, aes(x = dis, y = nox)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess", formula = y ~ x)
# Panel 3: degree-4 polynomial, no confidence band
g3 <- ggplot(Boston, aes(x = dis, y = nox)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ poly(x, 4), se = FALSE)
# Panel 4: degree-4 polynomial with the 95% confidence band
g4 <- ggplot(Boston, aes(x = dis, y = nox)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm",
              formula = y ~ poly(x, 4),
              level = 0.95, # default
              se = TRUE)    # default
grid.arrange(g1, g2, g3, g4, ncol = 2)
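The panels above fix the polynomial degree at 4, while the exercise also asks for the residual sum of squares across degrees 1 to 10 and for a cross-validated choice of degree. A minimal sketch of one way to do both is given here. It assumes the `boot` package is available for `cv.glm()`; the names `degrees`, `rss`, `cv_mse` and `best_degree`, the choice of 10 folds, and the seed are illustrative and not taken from the original.

```r
library(MASS)  # Boston data
library(boot)  # cv.glm()

degrees <- 1:10
rss     <- numeric(length(degrees))
cv_mse  <- numeric(length(degrees))

set.seed(1)  # arbitrary seed, only so the fold assignment is reproducible
for (d in degrees) {
  # In-sample residual sum of squares for a degree-d polynomial in dis
  fit_d  <- lm(nox ~ poly(dis, d), data = Boston)
  rss[d] <- sum(resid(fit_d)^2)

  # 10-fold cross-validated mean squared error for the same model
  # (glm() with the default gaussian family is the same least-squares fit)
  glm_d     <- glm(nox ~ poly(dis, d), data = Boston)
  cv_mse[d] <- cv.glm(Boston, glm_d, K = 10)$delta[1]
}

results <- data.frame(degree = degrees, rss = rss, cv_mse = cv_mse)
print(results)

# Degree with the smallest cross-validated error for this particular split
best_degree <- degrees[which.min(cv_mse)]
best_degree
```

The in-sample RSS can only go down as the degree increases, because each model nests the previous one, so it cannot be used on its own to pick a degree. The cross-validated error is the quantity to compare, and the selected degree may change with the seed when several degrees have similar error.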
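# Degree-4 orthogonal polynomial regression of nox on dis
fit <- lm(nox ~ poly(dis, 4), data = Boston)
# Standard lm diagnostic plots in a 2 x 2 layout
par(mfrow = c(2, 2))
plot(fit)
# Coefficient table: estimate, standard error, t value, p-value
coef(summary(fit))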
##                  Estimate  Std. Error     t value      Pr(>|t|)
## (Intercept)    0.55469506 0.002761339 200.8790240  0.000000e+00
## poly(dis, 4)1 -2.00309590 0.062114782 -32.2482963 2.540459e-124
## poly(dis, 4)2  0.85632995 0.062114782  13.7862506  6.924872e-37
## poly(dis, 4)3 -0.31804899 0.062114782  -5.1203430  4.356581e-07
## poly(dis, 4)4  0.03354668 0.062114782   0.5400757  5.893848e-01
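The exercise also asks for a regression spline fitted with bs() using four degrees of freedom, an account of how the knots were chosen, and the RSS over a range of degrees of freedom. The sketch below is one possible approach rather than the original author's solution. It relies on the default behaviour of `bs()`: with `df = 4` and the default cubic degree there is 4 - 3 = 1 interior knot, placed at the median of `dis`. The names `fit_bs`, `knots_used`, `dis_grid`, `dfs` and `rss_bs`, and the range 3 to 12, are illustrative assumptions.

```r
library(MASS)     # Boston data
library(splines)  # bs()

# Cubic regression spline with 4 degrees of freedom
fit_bs <- lm(nox ~ bs(dis, df = 4), data = Boston)
print(coef(summary(fit_bs)))

# Knots chosen automatically by bs(): one interior knot at the median of dis
knots_used <- attr(bs(Boston$dis, df = 4), "knots")
print(knots_used)

# Fitted curve over the observed range of dis
dis_grid <- seq(min(Boston$dis), max(Boston$dis), length.out = 200)
pred     <- predict(fit_bs, newdata = data.frame(dis = dis_grid))
plot(Boston$dis, Boston$nox, col = "grey", xlab = "dis", ylab = "nox")
lines(dis_grid, pred, lwd = 2)

# In-sample RSS for a range of degrees of freedom
# (3 is the smallest df that bs() accepts for a cubic spline)
dfs    <- 3:12
rss_bs <- sapply(dfs, function(k) {
  sum(resid(lm(nox ~ bs(dis, df = k), data = Boston))^2)
})
print(data.frame(df = dfs, rss = rss_bs))
```

Unlike the polynomial case, spline fits with different `df` are not nested, because the knot locations move as `df` changes, so the RSS need not fall at every step. Choosing `df` would follow the same cross-validation pattern as for the polynomial degree.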