Abstract by Mason Poggemann
A Comparison of Machine Learning Classifiers Across Platforms
A growing number of people rely on machine learning to process their data and make predictions. To do so, data is fed through an algorithm called a classifier, which uses patterns in the data to make predictions. There is a wide variety of classifiers, and since they all perform differently on different types of data, it is difficult to know which one to use in a given situation. Our goal is to apply machine learning to machine learning itself, so that it predicts for us which classifier will work best.
In order to make such predictions about a given dataset, we first need to know a few things about it. One simple way to gather information about the data is to run it through a simplified version of a classifier to get a feel for how the classifier would perform at full scale. The issue is that different programs implement even the same algorithm slightly differently, producing slightly different results. My aim is to compare several of the most popular implementations, mainly Weka, a package from the University of Waikato, and scikit-learn, a data-mining library for Python, and see whether there is any significant difference in how they perform.
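
As a rough illustration of running data through a simplified classifier, the sketch below scores a tiny dataset with a 1-nearest-neighbour classifier under leave-one-out evaluation. The choice of 1-nearest-neighbour, the function name `one_nn_landmark`, and the toy data are all assumptions made for illustration; the abstract does not say which simplified classifiers are used. The sketch uses only the Python standard library, so it is not the Weka or scikit-learn implementation being compared.

```python
from math import dist


def one_nn_landmark(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    Hypothetical helper: a cheap score meant to hint at how a
    full-scale classifier might behave on the same data.
    """
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Find the nearest point other than the held-out sample itself.
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: dist(xi, X[k]),
        )
        # Count a hit when the neighbour's label matches the true label.
        correct += y[nearest] == yi
    return correct / len(X)


# Made-up toy data: two well-separated clusters.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
y = ["a", "a", "a", "b", "b", "b"]

score = one_nn_landmark(X, y)
print(score)
```

A score like this could serve as one input feature describing the dataset. Computing the same score with two different implementations of the same algorithm is where small differences between platforms would show up.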