Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/14206
Full metadata record
DC Field | Value | Language
dc.contributor.author | Tekir, S. | -
dc.contributor.author | Güzel, A. | -
dc.contributor.author | Tenekeci, S. | -
dc.contributor.author | Haman, B.U. | -
dc.date.accessioned | 2024-01-06T07:22:37Z | -
dc.date.available | 2024-01-06T07:22:37Z | -
dc.date.issued | 2023 | -
dc.identifier.isbn | 9781959429548 | -
dc.identifier.uri | https://hdl.handle.net/11147/14206 | -
dc.description | 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, LaTeCH-CLfL 2023 -- 5 May 2023 -- 192793 | en_US
dc.description.abstract | Quotes are universally appealing. Humans recognize good quotes and save them for later reference. However, it may pose a challenge for machines. In this work, we build a new corpus of quotes and propose a new task, quote detection, as a type of span detection. We retrieve the quote set from Goodreads and collect the spans through a custom search on the Gutenberg Book Corpus. We run two types of baselines for quote detection: Conditional random field (CRF) and summarization with pointer-generator networks and Bidirectional and Auto-Regressive Transformers (BART). The results show that the neural sequence-to-sequence models perform substantially better than CRF. From the viewpoint of neural extractive summarization, quote detection seems easier than news summarization. Moreover, model fine-tuning on our corpus and the Cornell Movie-Quotes Corpus introduces incremental performance boosts. Finally, we provide a qualitative analysis to gain insight into the performance. © 2023 Association for Computational Linguistics. | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computational Linguistics | en_US
dc.relation.ispartof | EACL 2023 - 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Proceedings of LaTeCH-CLfL 2023 | en_US
dc.rights | info:eu-repo/semantics/closedAccess | en_US
dc.subject | Computational linguistics | en_US
dc.subject | Natural language processing systems | en_US
dc.subject | Auto-regressive | en_US
dc.subject | Extractive summarizations | en_US
dc.subject | Fine tuning | en_US
dc.subject | Gain insight | en_US
dc.subject | News summarization | en_US
dc.subject | Performance | en_US
dc.subject | Qualitative analysis | en_US
dc.subject | Random fields | en_US
dc.subject | Sequence models | en_US
dc.subject | Random processes | en_US
dc.title | Quote Detection: A New Task and Dataset for NLP | en_US
dc.type | Conference Object | en_US
dc.institutionauthor |  | -
dc.department | İzmir Institute of Technology | en_US
dc.identifier.startpage | 21 | en_US
dc.identifier.endpage | 27 | en_US
dc.identifier.scopus | 2-s2.0-85175428867 | en_US
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US
dc.authorscopusid | 16234844500 | -
dc.authorscopusid | 58675151700 | -
dc.authorscopusid | 57340107000 | -
dc.authorscopusid | 58675886200 | -
item.openairetype | Conference Object | -
item.cerifentitytype | Publications | -
item.fulltext | No Fulltext | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.grantfulltext | none | -
item.languageiso639-1 | en | -
crisitem.author.dept | 03.04. Department of Computer Engineering | -
Appears in Collections:Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.