BYU

Abstract by Peter Crawford

Personal Information


Presenter's Name

Peter Crawford

Degree Level

Undergraduate

Co-Authors

Drew Johnston

Abstract Information


Department

Mathematics

Faculty Advisor

Michael Dorff

Title

The Titanic: an Introduction to Data Science, Part 2

Abstract

Data science is a field centered on extracting knowledge and insights from data in its various forms. "The Titanic Problem," presented on Kaggle.com, offers an opportunity to learn important data science techniques by building a model that predicts whether passengers of the Titanic survived its sinking, based on the data given about them. Developing a solution required our team to clean and prepare the data, learn to work with machine learning models, optimize the selected features of our data set, and take measures to prevent overfitting. In this presentation, I will discuss alternative models, how they can be implemented in R, and how our team avoided overfitting.