Abstract by Iain Lee

Personal Information

Presenter's Name

Iain Lee


Bobby Alger
Jesses Williams
Andrea Vukorepa
Scott Corbitt

Degree Level



Bobby Alger

Abstract Information


Computer Science

Faculty Advisor

Mark Clement


Census Record Image Segmentation



United States census records consist of manually filled forms containing valuable demographic data. Images of these census documents are publicly available, but only as images accompanied by some metadata regarding location and date. Advances in machine learning provide increasingly cost-effective and time-efficient solutions to data extraction. However, current handwriting recognition techniques work only on lines of text. Therefore, to make use of the data found in U.S. census records, we must first extract those lines. We propose a method for extracting segments from the census in order to gather training data for handwriting recognition models. Our end goal is to automatically index the census records, and this method provides a crucial step in that process.

Keywords: image segmentation, machine learning, handwriting recognition, census data