Abstract by Piper Armstrong

Personal Information

Presenter's Name

Piper Armstrong

Degree Level



Stephen Cowley
Wilson Fearn
Courtni Byun
Kevin Seppi

Abstract Information


Computer Science

Faculty Advisor

Kevin Seppi


The Importance of Importance: Improving Cross-Reference Candidate Suggestion with PageRank


Cross-references are a useful tool for studying and understanding the content of a corpus. They allow users to find related sections of text; moreover, good cross-references focus on topically important sections of text. Prior work in automatic cross-referencing employs the Anchor Words algorithm and fine-grained topic modeling, but it addresses only the search for topically related documents. Our work focuses on the second element of good cross-references: the search for topically important cross-references. Our approach uses a novel combination of clustering and the PageRank algorithm to assign degrees of importance to the documents in the corpus. It then uses those ranks in conjunction with topical similarity to order candidate cross-reference pairs, resulting in significant improvement over the use of topic similarity alone.
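
The ranking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the document graph is built, how clustering enters, or how rank and similarity are combined, so the sketch makes three assumptions: the graph is the pairwise topical-similarity matrix itself, the clustering step is omitted, and a candidate pair is scored as its similarity multiplied by the PageRank of both documents. The function names `pagerank` and `rank_candidate_pairs` and the toy matrix are hypothetical.

```python
from itertools import combinations


def pagerank(weights, damping=0.85, iterations=100):
    """Power-iteration PageRank over a weighted adjacency matrix (list of lists)."""
    n = len(weights)
    ranks = [1.0 / n] * n
    row_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        # Every node receives the uniform "teleport" share first.
        new_ranks = [(1.0 - damping) / n] * n
        for i in range(n):
            if row_sums[i] == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for j in range(n):
                    new_ranks[j] += damping * ranks[i] / n
            else:
                # Pass rank along outgoing edges in proportion to edge weight.
                for j in range(n):
                    new_ranks[j] += damping * ranks[i] * weights[i][j] / row_sums[i]
        ranks = new_ranks
    return ranks


def rank_candidate_pairs(similarity, damping=0.85):
    """Order document pairs by similarity weighted by both documents' importance.

    ASSUMPTION: the product similarity * rank_i * rank_j is an illustrative
    scoring rule, not one stated in the abstract.
    """
    n = len(similarity)
    # ASSUMPTION: the graph is the similarity matrix with self-links removed.
    graph = [[0.0 if i == j else similarity[i][j] for j in range(n)]
             for i in range(n)]
    importance = pagerank(graph, damping)
    scored = [(similarity[i][j] * importance[i] * importance[j], i, j)
              for i, j in combinations(range(n), 2)]
    scored.sort(reverse=True)
    return importance, [(i, j) for _, i, j in scored]


# Toy corpus: document 0 is strongly similar to every other document,
# while documents 1-3 are only weakly similar to each other.
similarity = [
    [1.0, 0.9, 0.9, 0.9],
    [0.9, 1.0, 0.1, 0.1],
    [0.9, 0.1, 1.0, 0.1],
    [0.9, 0.1, 0.1, 1.0],
]
importance, ordered = rank_candidate_pairs(similarity)
```

In this toy corpus, document 0 receives the highest importance, and the pairs that include it are ordered ahead of the pairs among documents 1 to 3. A different graph construction or scoring rule would change the ordering, so the sketch shows only the general shape of combining importance with similarity.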