Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/14785
Title: THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension
Authors: Soygazi, F.
Çiftçi, O.
Kök, U.
Cengiz, S.
Keywords: Contextualized word embeddings
Deep learning
Information retrieval
Natural language understanding
Question answering
Publisher: Institute of Electrical and Electronics Engineers Inc.
Abstract: Question answering (QA) is a field of natural language processing and information retrieval that aims to answer questions posed in natural language. In this paper, we present THQuAD, a Turkish question answering dataset, together with baseline results obtained with contextualized word embeddings. THQuAD consists of two datasets: the first is TQuad, on Turkish Islamic science history, from the Teknofest 2018 "Artificial Intelligence Competition"; the second, on Ottoman history, was prepared by us for the Teknofest 2020 "Doğal Dil İşleme Yarışması" (Natural Language Processing Competition). THQuAD is a reading comprehension dataset consisting of questions, answers, and passages. Our objective is to answer a given question by understanding the passage and extracting the answer from it. We generate contextualized word embeddings from pre-trained Turkish BERT, ELECTRA, and ALBERT language models, fine-tuned with neural networks under different hyperparameter settings. © 2021 IEEE
URI: https://doi.org/10.1109/UBMK52708.2021.9559013
https://hdl.handle.net/11147/14785
ISBN: 978-166542908-5
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
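
The abstract describes THQuAD as a set of questions, answers, and passages in which the answer is extracted from the passage. A minimal sketch of such a record follows, assuming a SQuAD-style layout with a character offset for the answer. This record does not document THQuAD's actual schema: the field names (`passage`, `question`, `answer_text`, `answer_start`) are hypothetical, and the example sentence is illustrative and not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class QARecord:
    """One reading-comprehension example (hypothetical SQuAD-style layout)."""
    passage: str
    question: str
    answer_text: str
    answer_start: int  # character offset of the answer inside the passage


def extract_span(passage: str, start: int, length: int) -> str:
    """Return the substring of the passage that an answer span points to."""
    return passage[start:start + length]


def is_consistent(record: QARecord) -> bool:
    """Check that the stored offset really points at the stored answer text."""
    span = extract_span(record.passage, record.answer_start, len(record.answer_text))
    return span == record.answer_text


# Illustrative example only; not a record from THQuAD.
passage = "İstanbul 1453 yılında fethedildi."
record = QARecord(
    passage=passage,
    question="İstanbul hangi yılda fethedildi?",
    answer_text="1453",
    answer_start=passage.find("1453"),  # offset computed, not hard-coded
)

if __name__ == "__main__":
    print(is_consistent(record))
```

An extractive model of the kind the abstract mentions would predict the start and end of the span; a consistency check like `is_consistent` is one way to validate annotated offsets before fine-tuning.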

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.