THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension

Soygazi, F.; Çiftçi, O.; Kök, U.; Cengiz, S.

Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/14785

Title:	THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension
Authors:	Soygazi, F.; Çiftçi, O.; Kök, U.; Cengiz, S.
Keywords:	Contextualized word embeddings; Deep learning; Information retrieval; Natural language understanding; Question answering
Publisher:	Institute of Electrical and Electronics Engineers Inc.
Abstract:	Question answering (QA) is a field of natural language processing and information retrieval that aims to answer questions posed in natural language. In this paper, we present THQuAD, a Turkish question answering dataset, together with baseline results obtained with contextualized word embeddings. THQuAD combines two datasets: TQuad, on the history of Turkish-Islamic science, within the scope of the Teknofest 2018 "Artificial Intelligence Competition", and a second dataset on Ottoman history, prepared by us within the scope of the Teknofest 2020 "Doğal Dil İşleme Yarışması" (Natural Language Processing Competition). THQuAD is a reading comprehension dataset consisting of questions, answers, and passages. Our objective is to answer a given question by understanding the passage and extracting the answer from it. We generate contextualized word embeddings from pre-trained Turkish BERT, ELECTRA, and ALBERT language models after fine-tuning them with neural networks under different hyperparameter settings. © 2021 IEEE
URI:	https://doi.org/10.1109/UBMK52708.2021.9559013 https://hdl.handle.net/11147/14785
ISBN:	978-1-6654-2908-5
Appears in Collections:	Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Scopus citations: 15 (checked on Mar 28, 2025)
Page views: 86 (checked on Mar 31, 2025)