BYU

Abstract by Brandon Carter

Personal Information


Presenter's Name

Brandon Carter

Co-Presenters

None

Degree Level

Masters

Co-Authors

None

Abstract Information


Department

Statistics

Faculty Advisor

David Dahl

Title

Clustering via Pairwise Distance Information

Abstract

Cluster analysis is commonly used for exploratory data analysis. Hierarchical clustering, a popular heuristic technique for cluster analysis, is based on pairwise distance information among the items being clustered. As an alternative, we propose a new cluster analysis method based on formal probability distributions. Our method uses the same pairwise distance information and leverages the Ewens-Pitman Attraction (EPA) distribution (Dahl et al., 2017) to form cluster estimates and quantify uncertainty. Specifically, we propose to use the EPA distribution to simulate samples from the clustering distribution. We examine and compare a variety of partition estimation methods that use only a pairwise probability matrix. We compare this new clustering methodology and these estimation methods to existing procedures, and we characterize the similarities and differences between our distribution-based clustering procedure and traditional hierarchical clustering.
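
As a rough illustration of partition estimation from a pairwise probability matrix, the sketch below averages co-clustering indicators over a set of sampled partitions and then picks the sample closest to that matrix in squared error. This least-squares criterion is only one possible estimator and is not necessarily among those compared in this work. The function names and the hard-coded `samples` are hypothetical stand-ins; in practice the samples would be simulated from the clustering distribution.

```python
# Minimal sketch (assumptions: partitions are given as label lists, and the
# four hard-coded samples stand in for draws from a clustering distribution).

def pairwise_probability(samples):
    """Estimate P[i][j], the proportion of samples in which items i and j share a cluster."""
    n = len(samples[0])
    m = len(samples)
    probs = [[0.0] * n for _ in range(n)]
    for labels in samples:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    probs[i][j] += 1.0 / m
    return probs


def squared_loss(labels, probs):
    """Sum of squared differences between a partition's 0/1 co-clustering matrix and probs."""
    n = len(labels)
    return sum(
        ((1.0 if labels[i] == labels[j] else 0.0) - probs[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )


def least_squares_partition(samples, probs):
    """Return the sampled partition whose co-clustering matrix is closest to probs."""
    return min(samples, key=lambda labels: squared_loss(labels, probs))


# Hypothetical sampled partitions of four items (cluster labels per item).
samples = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]

probs = pairwise_probability(samples)
estimate = least_squares_partition(samples, probs)
print(estimate)
```

The off-diagonal entries of `probs` also give a simple view of uncertainty: values near 0 or 1 indicate pairs whose co-clustering is consistent across samples, while values near 0.5 indicate pairs the samples disagree on.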