A Blocking-Based Approach to Enhance Large-Scale Reference Linking
Published: June 2022
Book title: Proceedings of the workshop on understanding literature references in academic full text (ULITE) at JCDL 2022
Analyses and applications based on bibliographic references are of ever-increasing importance. However, reference linking methods described in the literature can link only around half of the references in papers. To improve the quality of reference linking in large scholarly data sets, we propose a blocking-based reference linking approach that utilizes a rich set of reference fields (title, author, journal, year, etc.) and is independent of a target collection of paper records to be linked to. We evaluate our approach on a corpus of 300,000 references. Relative to the original data, we achieve a 90% increase in papers linked through references, a five-fold increase in bibliographic coupling, and a nine-fold increase in in-text citations covered. The newly established links are of high quality (85% F1). We conclude that our proposed approach demonstrates a way towards better-quality scholarly data.
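To make the general idea of blocking concrete, the following is a minimal Python sketch and not the implementation evaluated in the paper. It groups reference strings into blocks by a cheap key, compares references only within a block, and clusters the matching pairs, so no target collection of paper records is needed. The blocking key (year plus first title token), the similarity score (mean of title and author string similarity), the threshold of 0.85, and all function names are assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def blocking_key(ref):
    # Hypothetical key: publication year plus the first token of the title.
    tokens = normalize(ref["title"]).split()
    return (ref.get("year"), tokens[0] if tokens else "")


def similarity(a, b):
    # Hypothetical score: mean of title and author string similarity.
    title = SequenceMatcher(None, normalize(a["title"]), normalize(b["title"])).ratio()
    author = SequenceMatcher(None, normalize(a["author"]), normalize(b["author"])).ratio()
    return (title + author) / 2


def link_references(refs, threshold=0.85):
    """Return clusters (sorted lists of indices into refs) judged to cite the same paper."""
    # Step 1: blocking. Only references sharing a key are ever compared.
    blocks = defaultdict(list)
    for i, ref in enumerate(refs):
        blocks[blocking_key(ref)].append(i)

    # Step 2: pairwise matching inside each block, merged with union-find.
    parent = list(range(len(refs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for members in blocks.values():
        for i, j in combinations(members, 2):
            if similarity(refs[i], refs[j]) >= threshold:
                parent[find(i)] = find(j)

    # Step 3: collect the clusters.
    clusters = defaultdict(list)
    for i in range(len(refs)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())
```

Calling `link_references` on a list of dictionaries with `title`, `author`, and `year` entries returns groups of indices that the sketch considers the same cited work. Blocking keeps the number of comparisons far below the quadratic all-pairs count, at the cost of missing duplicates whose keys differ (for example, a wrong year), which is why a real system would likely combine several keys and more reference fields.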