BYU

Abstract by Piper Armstrong

Personal Information


Presenter's Name

Piper Armstrong

Degree Level

Masters

Co-Authors

Stephen Cowley
Wilson Fearn
Courtni Byun
Kevin Seppi

Abstract Information


Department

Computer Science

Faculty Advisor

Kevin Seppi

Title

The Importance of Importance: Improving Cross-Reference Candidate Suggestion with PageRank

Abstract

Cross-references are a useful tool for studying and understanding the content of a corpus. They allow users to find related sections of text; moreover, good cross-references focus on topically important sections of text. Prior work in automatic cross-referencing employs the Anchor Words algorithm and fine-grained topic modeling but addresses only the search for topically related documents. Our work focuses on the second element of good cross-references: the search for topically important cross-references. Our approach for finding important groups of cross-references uses a novel combination of clustering and the PageRank algorithm to assign degrees of importance to the documents in the corpus. It then uses those ranks in conjunction with topical similarity to order candidate cross-reference pairs, resulting in significant improvement over the use of topic similarity alone.
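
As a rough illustration of the general idea, the sketch below runs PageRank over a small document graph and then orders candidate pairs by a combination of similarity and importance. It is not the authors' implementation. Several pieces are assumptions made only for this example: the graph is built directly from a toy topical-similarity matrix, the clustering step is omitted, and the pair score (similarity multiplied by the importance of both documents) is one hypothetical way to combine the two signals. The function names `pagerank` and `rank_candidate_pairs` are likewise hypothetical.

```python
from itertools import combinations


def pagerank(weights, damping=0.85, iterations=100):
    """Power-iteration PageRank on a weighted graph given as a square matrix."""
    n = len(weights)
    ranks = [1.0 / n] * n
    out_totals = [sum(row) for row in weights]
    for _ in range(iterations):
        # Every document receives the same "teleport" share each round.
        new_ranks = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_totals[i] == 0:
                # Dangling document: spread its rank evenly over all documents.
                for j in range(n):
                    new_ranks[j] += damping * ranks[i] / n
            else:
                for j in range(n):
                    share = weights[i][j] / out_totals[i]
                    new_ranks[j] += damping * ranks[i] * share
        ranks = new_ranks
    return ranks


def rank_candidate_pairs(similarity, damping=0.85):
    """Order document pairs by similarity weighted by document importance.

    ASSUMPTION: the scoring rule below is illustrative only; the abstract
    does not specify how ranks and topical similarity are combined.
    """
    n = len(similarity)
    # ASSUMPTION: edges are topical similarities, with self-links removed.
    graph = [[0.0 if i == j else similarity[i][j] for j in range(n)]
             for i in range(n)]
    importance = pagerank(graph, damping)
    scored = []
    for i, j in combinations(range(n), 2):
        score = similarity[i][j] * importance[i] * importance[j]
        scored.append((score, i, j))
    scored.sort(reverse=True)
    return [(i, j) for _, i, j in scored], importance


# Toy similarity matrix for four documents (made-up values).
similarity = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.6, 0.1],
    [0.7, 0.6, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
]
pairs, importance = rank_candidate_pairs(similarity)
print(pairs)
print(importance)
```

In this toy setup, a pair involving a weakly connected document is pushed down the candidate list even when its raw similarity is comparable to that of other pairs, which is the intuition behind using importance alongside topical similarity.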