The Impact of Feature Selection on One and Two-Class Classification Performance for Plant MicroRNAs

Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens

Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/5794

Title:	The Impact of Feature Selection on One and Two-Class Classification Performance for Plant MicroRNAs
Authors:	Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens
Keywords:	Feature selection; Machine learning; MicroRNAs; Plant genetics; Classification
Publisher:	PeerJ Inc.
Source:	Khalifa, W., Yousef, M., Saçar Demirci, M. D., and Allmer, J. (2016). The impact of feature selection on one and two-class classification performance for plant microRNAs. PeerJ, 2016(6). doi:10.7717/peerj.2135
Abstract:	MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure, which is recognized by a complex enzyme machinery. This processing ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC, where they act as recognition keys to aid in the regulation of target mRNAs. Determining miRNAs experimentally is involved and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although two-class classification (TCC) is generally used in the field, negative examples are hard to come by, so one-class classification (OCC) has been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC, where the best feature selection method achieved an average accuracy of 95.6%, ~29 percentage points better than the worst method, which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection; its largest performance gap is ~13%, which occurs for only two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.
URI:	http://doi.org/10.7717/peerj.2135 http://hdl.handle.net/11147/5794
ISSN:	2167-8359
Appears in Collections:	Molecular Biology and Genetics / Moleküler Biyoloji ve Genetik; PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Files in This Item:

File	Description	Size	Format
5794.pdf	Article	709.24 kB	Adobe PDF

Scopus citations:	10 (checked on May 16, 2025)
Web of Science citations:	12 (checked on May 23, 2025)
Page views:	344 (checked on Jun 16, 2025)
Downloads:	252 (checked on Jun 16, 2025)
