Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/14206
Full metadata record
DC Field | Value | Language
dc.contributor.author | Tekir, S. | -
dc.contributor.author | Güzel, A. | -
dc.contributor.author | Tenekeci, S. | -
dc.contributor.author | Haman, B.U. | -
dc.date.accessioned | 2024-01-06T07:22:37Z | -
dc.date.available | 2024-01-06T07:22:37Z | -
dc.date.issued | 2023 | -
dc.identifier.isbn | 9781959429548 | -
dc.identifier.uri | https://hdl.handle.net/11147/14206 | -
dc.description | 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, LaTeCH-CLfL 2023 -- 5 May 2023 -- 192793 | en_US
dc.description.abstract | Quotes are universally appealing. Humans recognize good quotes and save them for later reference. However, recognizing them may pose a challenge for machines. In this work, we build a new corpus of quotes and propose a new task, quote detection, as a type of span detection. We retrieve the quote set from Goodreads and collect the spans through a custom search on the Gutenberg Book Corpus. We run two types of baselines for quote detection: conditional random fields (CRF) and summarization with pointer-generator networks and Bidirectional and Auto-Regressive Transformers (BART). The results show that the neural sequence-to-sequence models perform substantially better than CRF. From the viewpoint of neural extractive summarization, quote detection seems easier than news summarization. Moreover, model fine-tuning on our corpus and the Cornell Movie-Quotes Corpus introduces incremental performance boosts. Finally, we provide a qualitative analysis to gain insight into the performance. © 2023 Association for Computational Linguistics. | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computational Linguistics | en_US
dc.relation.ispartof | EACL 2023 - 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Proceedings of LaTeCH-CLfL 2023 | en_US
dc.rights | info:eu-repo/semantics/closedAccess | en_US
dc.subject | Computational linguistics | en_US
dc.subject | Natural language processing systems | en_US
dc.subject | Auto-regressive | en_US
dc.subject | Extractive summarizations | en_US
dc.subject | Fine tuning | en_US
dc.subject | Gain insight | en_US
dc.subject | News summarization | en_US
dc.subject | Performance | en_US
dc.subject | Qualitative analysis | en_US
dc.subject | Random fields | en_US
dc.subject | Sequence models | en_US
dc.subject | Random processes | en_US
dc.title | Quote Detection: A New Task and Dataset for NLP | en_US
dc.type | Conference Object | en_US
dc.institutionauthor | | -
dc.department | İzmir Institute of Technology | en_US
dc.identifier.startpage | 21 | en_US
dc.identifier.endpage | 27 | en_US
dc.identifier.scopus | 2-s2.0-85175428867 | en_US
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US
dc.authorscopusid | 16234844500 | -
dc.authorscopusid | 58675151700 | -
dc.authorscopusid | 57340107000 | -
dc.authorscopusid | 58675886200 | -
dc.identifier.wosquality | N/A | -
dc.identifier.scopusquality | N/A | -
item.grantfulltext | none | -
item.languageiso639-1 | en | -
item.cerifentitytype | Publications | -
item.openairetype | Conference Object | -
item.fulltext | No Fulltext | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
crisitem.author.dept | 03.04. Department of Computer Engineering | -
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.