Towards the Construction of a Software Benchmarking Dataset Via Systematic Literature Review

Yurum, Ozan Rasit; Unlu, Huseyin; Demirors, Onur

Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/15507

Title:	Towards the Construction of a Software Benchmarking Dataset Via Systematic Literature Review
Authors:	Yurum, Ozan Rasit; Unlu, Huseyin; Demirors, Onur
Keywords:	Software Size Measurement; Effort Estimation; Dataset; Benchmarking; COSMIC; Systematic Literature Review
Publisher:	IEEE
Series/Report no.:	Euromicro Conference on Software Engineering and Advanced Applications
Abstract:	Effort estimation is a fundamental task during the planning of software projects. Prediction models usually rely on two essential factors: software size and effort data. The size of the software can be measured at various stages of the project with the desired accuracy. Nevertheless, the industry faces challenges in collecting reliable actual effort data. Consequently, organizations encounter difficulties in establishing effort prediction models. Benchmarking datasets are available, but in most cases they have huge variances that make them less useful for effort prediction. In this study, we aimed to answer whether a software benchmarking dataset can be created by gathering data from the literature. To the best of our knowledge, no comprehensive dataset is available that gathers the functional size and effort data reported in studies from the literature. For this purpose, we performed a systematic literature review to find studies that include projects measured with the COSMIC Functional Size Measurement (FSM) method and the related effort. As a result, we formed a dataset including 337 records from 18 studies that shared the corresponding size and effort data. Although we performed a limited search, we created a larger dataset than many datasets in the literature. In our review, we found that most studies did not share their dataset, and many lacked case details such as the implementation environment and the scope of software development life cycle activities included in the effort data. We also compared the dataset with the ISBSG repository and found that our dataset has less variation in productivity. Our review showed that creating a software benchmarking dataset by gathering data from the literature is feasible. In conclusion, this study addresses gaps in the literature through a cost-free and easily extendable dataset.
URI:	https://doi.org/10.1109/SEAA64295.2024.00037; https://hdl.handle.net/11147/15507
ISBN:	9798350380279; 9798350380262
ISSN:	2640-592X
Appears in Collections:	Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
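
The abstract reports that the literature-based dataset shows less variation in productivity than the ISBSG repository. A minimal sketch of how such a comparison could be set up is given below. It rests on assumptions not stated in this record: productivity is taken as effort per unit of functional size (person-hours per COSMIC Function Point), variation is measured with the coefficient of variation, and all records are made up for illustration. The names ProjectRecord, productivity and coefficient_of_variation are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class ProjectRecord:
    """One project: functional size in CFP and effort in person-hours."""
    size_cfp: float
    effort_hours: float


def productivity(record: ProjectRecord) -> float:
    """Effort per unit of functional size (assumed definition: hours per CFP)."""
    if record.size_cfp <= 0:
        raise ValueError("size must be positive")
    return record.effort_hours / record.size_cfp


def coefficient_of_variation(values: list[float]) -> float:
    """Sample standard deviation divided by the mean (unitless spread)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return stdev(values) / mean(values)


# Made-up records for illustration only; not taken from the paper or from ISBSG.
literature_like = [
    ProjectRecord(100, 400),
    ProjectRecord(50, 210),
    ProjectRecord(200, 780),
]
repository_like = [
    ProjectRecord(100, 200),
    ProjectRecord(50, 600),
    ProjectRecord(200, 400),
]

cv_literature = coefficient_of_variation([productivity(r) for r in literature_like])
cv_repository = coefficient_of_variation([productivity(r) for r in repository_like])

print(f"CV of productivity (literature-like sample): {cv_literature:.3f}")
print(f"CV of productivity (repository-like sample): {cv_repository:.3f}")
```

A lower coefficient of variation means productivity values cluster more tightly around their mean, which is one possible reading of "less variation"; the study itself may have used a different measure.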
