Abstract by Ruo-Yun Wang
Comparison of Classification Models
Classification models are built to answer research questions with a binary response variable. The car crash data set contains records of vehicle-related crashes, and its SEVERITY variable indicates whether at least one person involved in the accident sustained a serious injury. The main goal of this study is to compare three classification models: logistic regression, random forest, and logistic regression with elastic net regularization. We first split the data set into a training set and a test set. We then use a confusion matrix to compare the prediction accuracy of each model. Finally, for a given driving condition, we use each model to predict the probability of a severe car crash and compare the results.
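
As an illustration of the confusion-matrix comparison described above, the following minimal sketch shows how the four cell counts and the resulting accuracy could be computed for one model. The labels and predictions are hypothetical placeholders, not values from the car crash data set, and the function names are assumptions made for this example. It also assumes that SEVERITY is coded 0/1 with 1 meaning a severe crash, and that the comparison is made on held-out test data.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return the counts (tn, fp, fn, tp) for binary labels coded 0/1."""
    counts = Counter(zip(actual, predicted))
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def accuracy(actual, predicted):
    """Share of observations whose predicted class matches the actual class."""
    tn, fp, fn, tp = confusion_matrix(actual, predicted)
    return (tn + tp) / (tn + fp + fn + tp)


# Hypothetical held-out labels (1 = severe crash) and one model's predictions.
y_test = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_test, y_pred)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"accuracy={accuracy(y_test, y_pred):.3f}")
```

Repeating the same calculation on each model's predictions for the same held-out observations would give three accuracy values that can be compared directly.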