We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
Generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them. For instance, you can do this as in the first code chunk below.
Plot the observations, colored according to their class labels. Your plot should display X1 on the x-axis, and X2 on the y-axis.
Fit a logistic regression model to the data, using X1 and X2 as predictors.
Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be linear.
Now fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors (e.g. X1^2, X1 × X2, log(X2), and so forth).
Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be obviously non-linear. If it is not, then repeat (a)-(e) until you come up with an example in which the predicted class labels are obviously non-linear.
Fit a support vector classifier to the data with X1 and X2 as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
Fit an SVM using a non-linear kernel to the data. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
Comment on your results.
set.seed(421)
x1=runif(500)-0.5
x2=runif(500)-0.5
y=1*(x1^2 - x2^2 > 0) # Multiply by 1 so we don't get a boolean
#dat = data.frame(x1=x1, x2= x2, y=as.factor(y))
dat = data.frame(x1=x1, x2= x2, y=y)
plot(dat$x1, dat$x2, col=(3-y), pch=4)
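Since y = 1 exactly when x1^2 - x2^2 > 0, the true decision boundary is the pair of lines x2 = x1 and x2 = -x1. As a small sketch of ours (not part of the original answer), those lines can be overlaid on the scatterplot:
{plot(dat$x1, dat$x2, col = (3 - y), pch = 4, xlab = "X1", ylab = "X2")
abline(0, 1, lty = 2)  # x2 = x1
abline(0, -1, lty = 2) # x2 = -x1
}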
Fit a logistic regression model to the data, using X1 and X2 as predictors.
#set.seed(1)
train = sample(nrow(dat), nrow(dat)*.98) # training index; the fit below uses all of dat (see the note after the plots)
lr.fit = glm(y~., data = dat, family = binomial)
#lr.fit = glm(y ~ x1 + x2, data = dat[train,], family = binomial)
summary(lr.fit)
##
## Call:
## glm(formula = y ~ ., family = binomial, data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.278 -1.227 1.089 1.135 1.175
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.11999 0.08971 1.338 0.181
## x1 -0.16881 0.30854 -0.547 0.584
## x2 -0.08198 0.31476 -0.260 0.795
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 691.35 on 499 degrees of freedom
## Residual deviance: 690.99 on 497 degrees of freedom
## AIC: 696.99
##
## Number of Fisher Scoring iterations: 3
lr.predict.train.probs = predict(lr.fit, newdata = dat, type = "response")
#preds = rep(0, 250)
#preds[lr.predict.train.probs >= 0.5] = 1
preds = ifelse(lr.predict.train.probs > 0.52, 1, 0)
table(preds, dat[,"y"])
##
## preds 0 1
## 0 39 96
## 1 196 169
(28+109)/250 # 54.8% -- leftover from an earlier 250-observation run; does not match the table above
## [1] 0.548
(39+169)/500 # 41.6% accuracy from the table above
## [1] 0.416
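Equivalently (our addition), the overall training accuracy can be computed directly:
mean(preds == dat$y) # equals (39+169)/500 = 0.416 from the table above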
Note: 3 - y (and likewise 3 - preds) converts the 0/1 labels to plotting colors 3 and 2.
In the prediction plots below, positive predictions are drawn as a blue ‘+’.
Negative predictions are drawn as a red ‘x’.
dat.neg = dat[preds==0,]
dat.pos = dat[preds==1,]
{plot(dat.pos$x1, dat.pos$x2, col=("blue"), pch="+", xlab = "X1", ylab = "X2")
points(dat.neg$x1, dat.neg$x2, col=("red"), pch=4)
}
Prince's solution (reproduced below) used all the data. Breaking out a separate training subset gave us very different results; a sketch using the train index follows his code.
lm.fit = glm(y ~ x1 + x2, family = binomial)
data = data.frame(x1 = x1, x2 = x2, y = y)
lm.prob = predict(lm.fit, data, type = "response")
lm.pred = ifelse(lm.prob > 0.52, 1, 0)
data.pos = data[lm.pred == 1, ]
data.neg = data[lm.pred == 0, ]
plot(data.pos$x1, data.pos$x2, col = "blue", xlab = "X1", ylab = "X2", pch = "+")
points(data.neg$x1, data.neg$x2, col = "red", pch = 4)
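A hedged sketch of the train/holdout variant mentioned above, reusing the train index defined earlier. With the 98% split only 10 observations are held out, and the 0.5 threshold is our choice:
lr.fit.tr = glm(y ~ x1 + x2, data = dat[train, ], family = binomial) # fit on the training subset only
holdout.probs = predict(lr.fit.tr, newdata = dat[-train, ], type = "response")
holdout.preds = ifelse(holdout.probs > 0.5, 1, 0)
table(holdout.preds, dat[-train, "y"]) # confusion matrix on the held-out observations
mean(holdout.preds == dat[-train, "y"]) # holdout accuracy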
Now fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors (e.g. X1^2, X1 × X2, log(X2), and so forth).
#plot(lr.fit)
#library(glmnet)
#lr.fit.poly = glm(y~poly(x1,2) + poly(x2,2), data = dat[train,], family = binomial)
#lm.fit = glm(y ~ poly(x1, 2) + poly(x2, 2) + I(x1 * x2), data = data, family = binomial)
lr.fit.poly = glm(y~poly(x1,2) + poly(x2,2), data = dat, family = binomial)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#summary(lr.fit.poly)
pred.poly = predict(lr.fit.poly, dat, type = "response") # predicted probabilities; predict.glm has no family argument
lm.poly.pred = ifelse(pred.poly > 0.5, 1, 0)
dat.neg = dat[lm.poly.pred==0,]
dat.pos = dat[lm.poly.pred==1,]
{plot(dat.pos$x1, dat.pos$x2, col=("blue"), pch="+", xlab = "X1", ylab = "X2")
points(dat.neg$x1, dat.neg$x2, col=("red"), pch=4)
}
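As an alternative sketch of ours, the explicit transformations hinted at in the question can be used instead of poly(); log(X2) is skipped here because X2 takes negative values:
lr.fit.int = glm(y ~ I(x1^2) + I(x2^2) + I(x1 * x2), data = dat, family = binomial) # squares plus interaction
int.probs = predict(lr.fit.int, newdata = dat, type = "response")
int.preds = ifelse(int.probs > 0.5, 1, 0)
table(int.preds, dat$y) # training confusion matrix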
library(e1071)
dat.svm = data.frame(x1=x1, x2= x2, y=as.factor(y))
svm.fit <- svm(y ~ x1 + x2, data = dat.svm, kernel = "linear")
svm.pred = predict(svm.fit, dat.svm)
dat.neg = dat[svm.pred==0,]
dat.pos = dat[svm.pred==1,]
{plot(dat.pos$x1, dat.pos$x2, col=("blue"), pch="+", xlab = "X1", ylab = "X2")
points(dat.neg$x1, dat.neg$x2, col=("red"), pch=4)
}
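The support vector classifier above uses svm()'s default cost. A sketch of ours, with an arbitrary cost grid, of tuning it via e1071's tune():
set.seed(1)
tune.out = tune(svm, y ~ x1 + x2, data = dat.svm, kernel = "linear",
                ranges = list(cost = c(0.01, 0.1, 1, 10, 100))) # 10-fold CV over the cost grid
summary(tune.out) # cross-validation error for each cost
svm.best = tune.out$best.model # model refit at the best cost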
svm.fit <- svm(y ~ x1 + x2, data = dat.svm, kernel = "radial")
svm.pred = predict(svm.fit, dat.svm)
dat.neg = dat[svm.pred==0,]
dat.pos = dat[svm.pred==1,]
{plot(dat.pos$x1, dat.pos$x2, col=("blue"), pch="+", xlab = "X1", ylab = "X2")
points(dat.neg$x1, dat.neg$x2, col=("red"), pch=4)
}
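To support the "comment on your results" part, a sketch of ours that tabulates training confusion matrices for both kernels; note that svm.fit above now holds the radial fit, so the linear one is refit here:
svm.lin = svm(y ~ x1 + x2, data = dat.svm, kernel = "linear")
svm.rad = svm(y ~ x1 + x2, data = dat.svm, kernel = "radial")
table(predict(svm.lin, dat.svm), dat.svm$y) # linear kernel: training confusion matrix
table(predict(svm.rad, dat.svm), dat.svm$y) # radial kernel: training confusion matrix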