Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning

Sezerer, Erhan; Tekir, Selma

Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/11404

Full metadata record

DC Field	Value	Language
dc.contributor.author	Sezerer, Erhan	-
dc.contributor.author	Tekir, Selma	-
dc.date.accessioned	2021-11-06T09:48:29Z	-
dc.date.available	2021-11-06T09:48:29Z	-
dc.date.issued	2021	-
dc.identifier.issn	2076-3417	-
dc.identifier.uri	https://doi.org/10.3390/app11178241	-
dc.identifier.uri	https://hdl.handle.net/11147/11404	-
dc.description.abstract	Over the last few years, a growing number of studies have incorporated experiential (visual) information by building multi-modal language models and representations. Several studies have shown that language acquisition in humans starts with learning concrete concepts through images and then continues with learning abstract ideas through text. In this work, curriculum learning is used to teach the model concrete and abstract concepts through images and their corresponding captions to accomplish multi-modal language modeling/representation. We use BERT for the text modality and ResNet-152 for the image modality, and combine them using attentive pooling to perform pre-training on a newly constructed dataset, which is collected from Wikimedia Commons based on concrete/abstract words. To evaluate the proposed model, downstream tasks and ablation studies are performed. The contribution of this work is two-fold: a new dataset is constructed from Wikimedia Commons based on concrete/abstract words, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach contributes to the success of the model.	en_US
dc.language.iso	en	en_US
dc.publisher	MDPI	en_US
dc.relation.ispartof	Applied Sciences	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Multi-modal dataset	en_US
dc.subject	Wikimedia Commons	en_US
dc.subject	Multi-modal language model	en_US
dc.subject	Concreteness	en_US
dc.subject	Curriculum learning	en_US
dc.title	Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning	en_US
dc.type	Article	en_US
dc.authorid	0000-0002-0488-9682	-
dc.department	İzmir Institute of Technology. Computer Engineering	en_US
dc.identifier.volume	11	en_US
dc.identifier.issue	17	en_US
dc.identifier.wos	WOS:000695573500001	-
dc.identifier.scopus	2-s2.0-85114487960	-
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member	en_US
dc.identifier.doi	10.3390/app11178241	-
dc.identifier.wosquality	Q2	-
dc.identifier.scopusquality	Q3	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.languageiso639-1	en	-
item.openairetype	Article	-
item.grantfulltext	open	-
item.fulltext	With Fulltext	-
item.cerifentitytype	Publications	-
crisitem.author.dept	03.04. Department of Computer Engineering	-
crisitem.author.dept	03.04. Department of Computer Engineering	-
Appears in Collections:	Computer Engineering; Scopus Indexed Publications Collection; WoS Indexed Publications Collection

Files in This Item:

File	Size	Format
applsci-11-08241.pdf	1 MB	Adobe PDF

Scopus citations: 1 (checked on Mar 28, 2025)
Web of Science citations: 1 (checked on Mar 29, 2025)
Page view(s): 694 (checked on Mar 31, 2025)
Download(s): 158 (checked on Mar 31, 2025)