class: misk-title-slide

<br><br><br><br><br>
# .font140[Multivariate Adaptive Regression Splines]

---
# Prerequisites

.pull-left[

.center.bold.font120[Packages]

```r
# Helper packages
library(dplyr)    # for data wrangling
library(ggplot2)  # for awesome plotting
library(rsample)  # for data splitting

# Modeling packages
library(earth)    # for fitting MARS models
library(caret)    # for automating the tuning process

# Model interpretability packages
library(vip)      # for variable importance
library(pdp)      # for variable relationships
```

]

.pull-right[

.center.bold.font120[Data]

```r
# ames data
ames <- AmesHousing::make_ames()

# split data
set.seed(123)
split <- initial_split(ames, strata = "Sale_Price")
ames_train <- training(split)
```

]

---
# The idea

* So far, we have tried to improve our linear model with various feature reduction and regularization approaches
* However, we are still assuming linear relationships
* The actual relationship(s) may have non-linear patterns that we cannot capture

<img src="07-mars-slides_files/figure-html/non-linearity-1.png" style="display: block; margin: auto;" />

---
# The idea

.font120[

* There are some traditional approaches we could take to capture non-linear relationships:
   - polynomial relationships
   - step function relationships

]

<img src="07-mars-slides_files/figure-html/traditional-nonlinear-approaches-1.png" style="display: block; margin: auto;" />

<br>

.center.bold.blue[However, these require the user to explicitly identify & incorporate the non-linear terms <img src="https://emojis.slackmojis.com/emojis/images/1542340473/4983/yuck.gif?1542340473" style="height:2em; width:auto; "/>
]

---
# The idea

.pull-left[

* Multivariate adaptive regression splines (MARS) provide a convenient & automated approach to capture non-linearity
* Easy transition from linear regression to non-linear modeling
* Looks for .blue[knots] in predictors

<br><br>

`\begin{equation}
  \text{y} =
  \begin{cases}
    \beta_0 + \beta_1(1.183606 - \text{x}) & \text{x} < 1.183606, \\
    \beta_0 + \beta_1(\text{x} - 1.183606) & \text{x} \geq 1.183606
  \end{cases}
\end{equation}`

]

.pull-right[

<img src="07-mars-slides_files/figure-html/one-knot-1.png" style="display: block; margin: auto;" />

]

---
# The idea

.pull-left[

* Multivariate adaptive regression splines (MARS) provide a convenient & automated approach to capture non-linearity
* Easy transition from linear regression to non-linear modeling
* Looks for .blue[knots] in predictors

<br><br>

`\begin{equation}
  \text{y} =
  \begin{cases}
    \beta_0 + \beta_1(1.183606 - \text{x}) & \text{x} < 1.183606, \\
    \beta_0 + \beta_1(\text{x} - 1.183606) & 1.183606 \leq \text{x} < 4.898114, \\
    \beta_0 + \beta_1(4.898114 - \text{x}) & \text{x} \geq 4.898114
  \end{cases}
\end{equation}`

]

.pull-right[

<img src="07-mars-slides_files/figure-html/two-knots-1.png" style="display: block; margin: auto;" />

]

---
# The idea

.pull-left[

* Multivariate adaptive regression splines (MARS) provide a convenient & automated approach to capture non-linearity
* Easy transition from linear regression to non-linear modeling
* Looks for .blue[knots] in predictors

]

.pull-right[

<img src="07-mars-slides_files/figure-html/three-knots-1.png" style="display: block; margin: auto;" />

]

---
# The idea

.pull-left[

* Multivariate adaptive regression splines (MARS) provide a convenient & automated approach to capture non-linearity
* Easy transition from linear regression to non-linear modeling
* Looks for .blue[knots] in predictors

]

.pull-right[

<img src="07-mars-slides_files/figure-html/four-knots-1.png" style="display: block; margin: auto;" />

]

---
# The idea
.pull-left[

* Multivariate adaptive regression splines (MARS) provide a convenient & automated approach to capture non-linearity
* Easy transition from linear regression to non-linear modeling
* Looks for .blue[knots] in predictors

]

.pull-right[

<img src="07-mars-slides_files/figure-html/nine-knots-1.png" style="display: block; margin: auto;" />

]

---
# R packages 📦

.pull-left[

## [`mda`](https://cran.r-project.org/package=mda)

* **m**ixture **d**iscriminant **a**nalysis
* Lightweight function `mars()`
* Gives quite similar results to Friedman's original FORTRAN program
* No formula method

]

.pull-right[

## [`earth`](http://www.milbo.users.sonic.net/earth/) 🌎

* **e**nhanced **a**daptive **r**egression **t**hrough **h**inges
* Derived from `mda::mars()`
* Support for GLMs (e.g., logistic regression)
* More bells and whistles than `mda::mars()`; for example,
    - Variable importance scores
    - Support for `\(k\)`-fold cross-validation

]

---
# Tuning parameters

MARS models have two tuning parameters:

.pull-left[

1. .blue[_nprune_]: the maximum number of terms in the pruned model (including the intercept)
2. .blue[_degree_]: the maximum degree of interaction

]

.pull-right[

```r
caret::getModelInfo("earth")$earth$parameters
##   parameter   class          label
## 1    nprune numeric         #Terms
## 2    degree numeric Product Degree
```

]

---
# Implementation

.scrollable90[

.pull-left[

```r
# tuning grid
hyper_grid <- expand.grid(
  nprune = seq(2, 50, length.out = 10) %>% floor(),
  degree = 1:3
)

# perform resampling
set.seed(123)
cv_mars <- train(
  Sale_Price ~ .,
  data = ames_train,
  trControl = trainControl(method = "cv", number = 10),
* method = "earth",
  tuneGrid = hyper_grid,
  metric = "RMSE"
)

# best model
cv_mars$results %>%
  filter(
    nprune == cv_mars$bestTune$nprune,
    degree == cv_mars$bestTune$degree
  )
##   degree nprune     RMSE  Rsquared      MAE   RMSESD RsquaredSD    MAESD
## 1      1     44 26334.75 0.8929768 16789.34 3952.154 0.02517833 920.1562
```

]

.pull-right[

```r
# plot results
plot(cv_mars)
```

<img src="07-mars-slides_files/figure-html/cv-mars-plot-1.png" style="display: block; margin: auto;" />

]

]

---
# Feature importance

* Backward elimination feature selection routine that looks at reductions in the GCV estimate of error as each predictor is added to the model.
* This total reduction is used as the variable importance measure (`value = "gcv"`).
* Can also monitor the change in the residual sum of squares (RSS) as terms are added (`value = "rss"`).

.bold.center[Automated feature selection]

.scrollable90[

```r
p1 <- vip(cv_mars, num_features = 40, geom = "point", value = "gcv") + ggtitle("GCV")
p2 <- vip(cv_mars, num_features = 40, geom = "point", value = "rss") + ggtitle("RSS")

gridExtra::grid.arrange(p1, p2, ncol = 2)
```

<img src="07-mars-slides_files/figure-html/mars-vip-1.png" style="display: block; margin: auto;" />

]

---
# Partial dependence plots

```r
# Construct partial dependence plots
p1 <- partial(cv_mars, pred.var = "Gr_Liv_Area", grid.resolution = 10) %>%
  ggplot(aes(Gr_Liv_Area, yhat)) +
  geom_line()

p2 <- partial(cv_mars, pred.var = "Year_Built", grid.resolution = 10) %>%
  ggplot(aes(Year_Built, yhat)) +
  geom_line()

p3 <- partial(cv_mars, pred.var = c("Gr_Liv_Area", "Year_Built"),
              grid.resolution = 10) %>%
  plotPartial(levelplot = FALSE, zlab = "yhat", drape = TRUE, colorkey = TRUE,
              screen = list(z = -20, x = -60))

# Display plots side by side
gridExtra::grid.arrange(p1, p2, p3, ncol = 3)
```

<img src="07-mars-slides_files/figure-html/pdp-1.png" style="display: block; margin: auto;" />

---
class: clear, center, middle, hide-logo
background-image: url(images/any-questions.jpg)
background-position: center
background-size: cover

---
# Back home

<br><br><br><br>

[.center[
<i class="fas fa-home fa-10x"></i>
]](https://github.com/misk-data-science/misk-homl)

.center[https://github.com/misk-data-science/misk-homl]
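---
# Aside: hinge functions by hand

The piecewise equations on the earlier slides are built from pairs of .blue[hinge] (hockey-stick) basis functions, `\(h(u) = \max(0, u)\)`. A minimal base-R sketch of the idea (the knot value 1.183606 comes from the earlier slides; `h`, `knot`, and the coefficient names in the comment are illustrative, not part of the `earth` API):

```r
# Hinge (hockey-stick) basis function: h(u) = max(0, u)
h <- function(u) pmax(0, u)

knot <- 1.183606
x <- c(0.5, knot, 2.0)

h(knot - x)  # non-zero only to the left of the knot
h(x - knot)  # non-zero only to the right of the knot

# A one-predictor MARS fit is then just a linear model in these bases:
# y = b0 + b1 * h(knot - x) + b2 * h(x - knot)
```

Because exactly one hinge of the pair is non-zero on each side of the knot, fitting a linear model in these bases reproduces the two-case equation shown earlier.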