Show simple item record

dc.contributor.author: Khalifa, Waleed
dc.contributor.author: Yousef, Malik
dc.contributor.author: Saçar Demirci, Müşerref Duygu
dc.contributor.author: Allmer, Jens
dc.date.accessioned: 2017-06-28T08:33:00Z
dc.date.available: 2017-06-28T08:33:00Z
dc.date.issued: 2016
dc.identifier.citation: Khalifa, W., Yousef, M., Saçar Demirci, M. D., and Allmer, J. (2016). The impact of feature selection on one and two-class classification performance for plant microRNAs. PeerJ, 2016(6). doi:10.7717/peerj.2135 [en_US]
dc.identifier.issn: 2167-8359
dc.identifier.uri: http://doi.org/10.7717/peerj.2135
dc.identifier.uri: http://hdl.handle.net/11147/5794
dc.description.abstract: MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure, which is recognized by a complex enzyme machinery. This recognition ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC, where they act as recognition keys to aid in the regulation of target mRNAs. Determining miRNAs experimentally is involved and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although two-class classification (TCC) is generally used in the field, one-class classification (OCC) has been tried for pre-miRNA detection because negative examples are hard to come by. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC, where the best feature selection method achieved an average accuracy of 95.6%, ~29 percentage points better than the worst method, which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection, and its largest performance gap is ~13%, which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features. [en_US]
dc.description.sponsorship: The Scientific and Technological Research Council of Turkey (grant number 113E326) [en_US]
dc.language.iso: eng [en_US]
dc.publisher: PeerJ Inc. [en_US]
dc.relation: info:eu-repo/grantAgreement/TUBITAK/EEEAG/113E326 [en_US]
dc.relation.isversionof: 10.7717/peerj.2135 [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Feature selection [en_US]
dc.subject: Machine learning [en_US]
dc.subject: MicroRNAs [en_US]
dc.subject: Plant genetics [en_US]
dc.subject: Classification [en_US]
dc.title: The impact of feature selection on one and two-class classification performance for plant microRNAs [en_US]
dc.type: article [en_US]
dc.contributor.authorID: TR114170 [en_US]
dc.contributor.authorID: TR107974 [en_US]
dc.contributor.institutionauthor: Saçar Demirci, Müşerref Duygu
dc.contributor.institutionauthor: Allmer, Jens
dc.relation.journal: PeerJ [en_US]
dc.contributor.department: İYTE, Faculty of Science, Department of Molecular Biology and Genetics [en_US]
dc.identifier.volume: 2016 [en_US]
dc.identifier.issue: 6 [en_US]
dc.identifier.wos: WOS:000378351000002
dc.identifier.scopus: SCOPUS:2-s2.0-84977103713
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]

