Please use this identifier to cite or link to this item:
https://hdl.handle.net/11147/14785
Title: | THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension |
Authors: | Soygazi, F.; Çiftçi, O.; Kök, U.; Cengiz, S. |
Keywords: | Contextualized word embeddings; Deep learning; Information retrieval; Natural language understanding; Question answering |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Abstract: | Question answering (QA) is a field of natural language processing and information retrieval that aims to answer questions posed in natural language. In this paper, we present THQuAD, a Turkish question answering dataset, together with baseline results obtained with contextualized word embeddings. THQuAD consists of two datasets: the first is TQuad, on Turkish Islamic science history, within the scope of the Teknofest 2018 "Artificial Intelligence Competition"; the second, on Ottoman history, was prepared by us within the scope of the Teknofest 2020 "Doğal Dil İşleme Yarışması" (Natural Language Processing Competition). THQuAD is a reading comprehension dataset consisting of questions, answers, and passages. Our objective is to answer a specific question by understanding the passage and extracting the answer from it. We generate contextualized word embeddings from pre-trained Turkish BERT, ELECTRA, and ALBERT language models after fine-tuning them with neural networks under different hyperparameter settings. © 2021 IEEE |
URI: | https://doi.org/10.1109/UBMK52708.2021.9559013 https://hdl.handle.net/11147/14785 |
ISBN: | 978-166542908-5 |
Appears in Collections: | Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection |
Scopus citations: 11 (checked on Nov 15, 2024)
Page views: 28 (checked on Nov 18, 2024)
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.