Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning

Sezerer, Erhan; Tekir, Selma

Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/11404

Title:	Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning
Authors:	Sezerer, Erhan; Tekir, Selma
Keywords:	Multi-modal dataset; Wikimedia Commons; Multi-modal language model; Concreteness; Curriculum learning
Publisher:	MDPI
Abstract:	Over the last few years, a growing number of studies have incorporated experiential (visual) information by building multi-modal language models and representations. Several studies have shown that language acquisition in humans starts with learning concrete concepts through images and then continues with learning abstract ideas through text. In this work, curriculum learning is used to teach the model concrete and abstract concepts through images and their corresponding captions in order to accomplish multi-modal language modeling and representation. We use BERT and ResNet-152 models for the text and image modalities, respectively, and combine them using attentive pooling to perform pre-training on a newly constructed dataset, which is collected from Wikimedia Commons based on concrete/abstract words. To evaluate the proposed model, downstream tasks and ablation studies are performed. The contribution of this work is two-fold: a new dataset is constructed from Wikimedia Commons based on concrete/abstract words, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach contributes to the success of the model.
URI:	https://doi.org/10.3390/app11178241; https://hdl.handle.net/11147/11404
ISSN:	2076-3417
Appears in Collections:	Computer Engineering / Bilgisayar Mühendisliği; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Files in This Item:

File	Size	Format
applsci-11-08241.pdf	1 MB	Adobe PDF

Scopus citations:	1 (checked on May 16, 2025)
Web of Science citations:	1 (checked on May 10, 2025)
Page views:	722 (checked on Jun 16, 2025)
Downloads:	168 (checked on Jun 16, 2025)
