BYU

Abstract by Wilson Fearn

Personal Information


Presenter's Name

Wilson Fearn

Co-Presenters

None

Degree Level

Undergraduate

Co-Authors

None

Abstract Information


Department

Computer Science

Faculty Advisor

Kevin Seppi

Title

Feature Hashing to Improve Interactive Topic Modeling Speeds

Abstract

3M

Interactive topic models let users explore large bodies of text without expertise in machine learning. This capability is becoming more important as the amount of data we want to analyze and process grows. Current topic modeling methods support interaction with large datasets, but not with datasets at internet scale. We propose feature hashing the corpus vocabulary to speed up topic modeling without sacrificing topic usefulness. Our experiments show that this method is feasible and may as much as halve the running time of the topic modeling process while retaining topic quality.
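
As a rough illustration of what feature hashing a vocabulary can look like, the sketch below maps each word to one of a fixed number of buckets, so the topic model would see a vocabulary of at most num_buckets entries regardless of how many distinct words the corpus contains. The abstract does not specify the hash function, the bucket count, or the tokenization, so the use of CRC32, whitespace tokenization, and the names hash_token, hash_document, hash_corpus, and num_buckets are all assumptions made for this example, not the author's implementation.

```python
import zlib
from collections import Counter


def hash_token(token: str, num_buckets: int) -> int:
    """Map a token to a bucket index in [0, num_buckets).

    CRC32 is an assumption chosen because it is deterministic across runs;
    the abstract does not say which hash function was used.
    """
    return zlib.crc32(token.encode("utf-8")) % num_buckets


def hash_document(tokens, num_buckets: int) -> Counter:
    """Count bucket occurrences instead of word occurrences.

    Distinct words that collide in the same bucket are merged into one feature.
    """
    return Counter(hash_token(t, num_buckets) for t in tokens)


def hash_corpus(docs, num_buckets: int):
    """Lowercase and whitespace-tokenize each document, then hash it."""
    return [hash_document(doc.lower().split(), num_buckets) for doc in docs]


# Toy corpus, invented for illustration only.
docs = [
    "topic models summarize large text collections",
    "feature hashing shrinks the vocabulary of a corpus",
]
hashed = hash_corpus(docs, num_buckets=8)
```

A smaller num_buckets shrinks the effective vocabulary further but causes more collisions between unrelated words, which is the trade-off between speed and topic quality that the abstract refers to.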