Abstract by Drew Johnston
The Titanic: an Introduction to Data Science, Part 1
Data science is a field centered on extracting knowledge and insights from data in its various forms. "The Titanic Problem" presented on Kaggle.com is an opportunity to learn important data science techniques by building a model that predicts, from the data given about each passenger, whether that passenger survived the sinking of the Titanic. Developing a solution required our team to clean and prepare data, learn to work with machine learning models, optimize the features selected from our data set, and take measures to prevent overfitting. In this presentation, I'll discuss the problem in depth, as well as the steps taken to clean the data and select features, from the perspective of coding in Python.
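
As a rough illustration of what cleaning data and selecting features can look like in Python, here is a minimal sketch using only the standard library. It is not the team's actual code. The column names (Sex, Age, Pclass), the median fill for missing ages, the helper names clean and select_features, and the sample rows are all assumptions made for this example; the rows are invented and are not taken from the Kaggle data set.

```python
from statistics import median

# Made-up rows for illustration only; not real passenger records.
rows = [
    {"Sex": "male", "Age": 22.0, "Pclass": 3},
    {"Sex": "female", "Age": None, "Pclass": 1},  # missing age
    {"Sex": "female", "Age": 30.0, "Pclass": 3},
    {"Sex": "male", "Age": 40.0, "Pclass": 1},
]


def clean(rows):
    """Fill missing ages with the median age and encode Sex as 0/1."""
    known_ages = [r["Age"] for r in rows if r["Age"] is not None]
    fill_age = median(known_ages)
    cleaned = []
    for r in rows:
        cleaned.append({
            "Age": r["Age"] if r["Age"] is not None else fill_age,
            "IsFemale": 1 if r["Sex"] == "female" else 0,
            "Pclass": r["Pclass"],
        })
    return cleaned


def select_features(rows, names):
    """Keep only the chosen columns, in order, as a list of feature vectors."""
    return [[r[name] for name in names] for r in rows]


features = select_features(clean(rows), ["Pclass", "IsFemale", "Age"])
```

Each inner list in features is one passenger's numeric feature vector, ready to pass to a model. Filling with the median is only one of several reasonable choices for missing values.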