Title: Çok-etiketli film türü sınıflandırması için Türkçe konu modellemesi veri kümesi
Other Titles: A Turkish topic modeling dataset for multi-label classification of movie genre
Authors: Jabrayilzade, Elgün
Poyraz Arslan, Algın
Para, Hasan
Polatbilek, Ozan
Sezerer, Erhan
Tekir, Selma
Keywords: Doc2Vec
Feed-forward neural networks
Long text classification
Short text classification
Text classification dataset
Issue Date: 2020
Publisher: Institute of Electrical and Electronics Engineers
Abstract: Statistical topic modeling aims to assign topics to documents in an unsupervised way. Latent Dirichlet Allocation (LDA) is the standard model for topic modeling. It performs well on document collections, where documents are relatively long texts, but performs poorly on short texts. Topic modeling on short texts is on the rise due to the potential of social media. Thus, approaches that are able to find topics in short texts as well as long texts are sought. However, there is a lack of datasets that include both long and short texts with the same ground-truth categories. In this work, we release a Turkish movie dataset that contains both short film descriptions and long subscripts, where film genre can be considered as the topic. Furthermore, we provide multi-label movie genre classification results using a Feed Forward Neural Network (FFNN) that takes LDA document-topic or Doc2Vec dense representations as input. © 2020 IEEE.
Description: 28th Signal Processing and Communications Applications Conference, SIU 2020 -- 5 October 2020 through 7 October 2020
ISBN: 9781728172064
Appears in Collections:Computer Engineering / Bilgisayar Mühendisliği
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Files in This Item:
File: A_Turkish_Topic.pdf
Size: 223.29 kB
Format: Adobe PDF
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.