Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/14206
Title: Quote Detection: A New Task and Dataset for NLP
Authors: Tekir, S.
Güzel, A.
Tenekeci, S.
Haman, B.U.
Keywords: Computational linguistics
Natural language processing systems
Auto-regressive
Extractive summarization
Fine-tuning
Gain insight
News summarization
Performance
Qualitative analysis
Random fields
Sequence models
Random processes
Publisher: Association for Computational Linguistics
Abstract: Quotes are universally appealing. Humans recognize good quotes and save them for later reference. However, recognizing them may pose a challenge for machines. In this work, we build a new corpus of quotes and propose a new task, quote detection, as a type of span detection. We retrieve the quote set from Goodreads and collect the spans through a custom search on the Gutenberg Book Corpus. We run two types of baselines for quote detection: conditional random fields (CRF) and summarization with pointer-generator networks and Bidirectional and Auto-Regressive Transformers (BART). The results show that the neural sequence-to-sequence models perform substantially better than CRF. From the viewpoint of neural extractive summarization, quote detection seems easier than news summarization. Moreover, model fine-tuning on our corpus and the Cornell Movie-Quotes Corpus introduces incremental performance boosts. Finally, we provide a qualitative analysis to gain insight into the performance. © 2023 Association for Computational Linguistics.
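The abstract frames quote detection as span detection, with a CRF sequence labeler as one baseline. A minimal sketch of how such a span can be encoded as BIO tags over tokens (the sentence, span indices, and function name below are hypothetical illustrations, not from the paper):

```python
# Hypothetical sketch: encoding a quote span as BIO tags, the usual
# input/output representation for a CRF-style span-detection baseline.

def bio_tags(tokens, span):
    """Label tokens with B-/I-QUOTE inside the half-open (start, end) span, O elsewhere."""
    start, end = span
    tags = []
    for i, _ in enumerate(tokens):
        if i == start:
            tags.append("B-QUOTE")
        elif start < i < end:
            tags.append("I-QUOTE")
        else:
            tags.append("O")
    return tags

tokens = "He said : Be yourself ; everyone else is taken .".split()
tags = bio_tags(tokens, (3, 11))  # assumed quote span covers tokens 3..10
```

A tagger trained on such sequences predicts, for each token, whether it begins, continues, or lies outside a quote span.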
Description: 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, LaTeCH-CLfL 2023 -- 5 May 2023
URI: https://hdl.handle.net/11147/14206
ISBN: 9781959429548
Appears in Collections:Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.