Abstract by Peter Crawford
The Titanic: An Introduction to Data Science, Part 2
Data science is a field centered on extracting knowledge and insights from data in its various forms. The "Titanic Problem" presented on Kaggle.com is an opportunity to learn important data science techniques by building a model that predicts, from the data given about each passenger, whether that passenger survived the sinking of the Titanic. Developing a solution required our team to clean and prepare data, learn to work with machine learning models, optimize selected features of our data set, and take measures to prevent overfitting. In this presentation, I'll discuss alternative models, how they can be implemented in R, and how our team avoided overfitting.