Publication:
Feature selection using correlation analysis and principal component analysis for accurate breast cancer diagnosis

dc.affiliation.dpto: UC3M. Departamento de Informática [es]
dc.affiliation.grupoinv: UC3M. Grupo de Investigación: Inteligencia Artificial Aplicada (GIAA) [es]
dc.contributor.author: Ibrahim, Sara
dc.contributor.author: Nazir, Saima
dc.contributor.author: Velastin Carroza, Sergio Alejandro
dc.date.accessioned: 2021-11-03T08:31:44Z
dc.date.available: 2021-11-03T08:31:44Z
dc.date.issued: 2021-11
dc.description.abstract: Breast cancer is one of the leading causes of death among women, more so than all other cancers. The accurate diagnosis of breast cancer is very difficult due to the complexity of the disease, changing treatment procedures and different patient population samples. Diagnostic techniques with better performance are very important for personalized care and treatment and to reduce and control the recurrence of cancer. The main objective of this research was to select feature selection techniques using correlation analysis and variance of input features before passing these significant features to a classification method. We used an ensemble method to improve the classification of breast cancer. The proposed approach was evaluated using the public WBCD dataset (Wisconsin Breast Cancer Dataset). Correlation analysis and principal component analysis were used for dimensionality reduction. Performance was evaluated for well-known machine learning classifiers, and the best seven classifiers were chosen for the next step. Hyper-parameter tuning was performed to improve the performances of the classifiers. The best performing classification algorithms were combined with two different voting techniques. Hard voting predicts the class that gets the majority vote, whereas soft voting predicts the class based on highest probability. The proposed approach performed better than state-of-the-art work, achieving an accuracy of 98.24%, high precision (99.29%) and a recall value of 95.89%. [en]
dc.format.extent: 16
dc.identifier.bibliographicCitation: Ibrahim, S., Nazir, S. & Velastin, S. A. (2021). Feature Selection Using Correlation Analysis and Principal Component Analysis for Accurate Breast Cancer Diagnosis. Journal of Imaging, 7(11), 225. [en]
dc.identifier.doi: https://doi.org/10.3390/jimaging7110225
dc.identifier.issn: 2313-433X
dc.identifier.publicationfirstpage: 225
dc.identifier.publicationissue: 11
dc.identifier.publicationtitle: Journal of Imaging [en]
dc.identifier.publicationvolume: 7
dc.identifier.uri: http://hdl.handle.net/10016/33516
dc.identifier.uxxi: AR/0000028513
dc.language.iso: eng
dc.publisher: MDPI
dc.rights: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. [en]
dc.rights: Atribución 3.0 España (Attribution 3.0 Spain) [*]
dc.rights.accessRights: open access [en]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/ [*]
dc.subject.eciencia: Informática (Computer Science) [es]
dc.subject.other: Breast cancer diagnosis [en]
dc.subject.other: Wisconsin breast cancer dataset [en]
dc.subject.other: Feature selection [en]
dc.subject.other: Dimensionality reduction [en]
dc.subject.other: Principal component analysis [en]
dc.subject.other: Ensemble method [en]
dc.title: Feature selection using correlation analysis and principal component analysis for accurate breast cancer diagnosis [en]
dc.type: research article [*]
dc.type.hasVersion: VoR [*]
dspace.entity.type: Publication
Files
Original bundle
Name: Feature_JI_2021.pdf
Size: 662.06 KB
Format: Adobe Portable Document Format
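
The abstract in this record distinguishes hard voting (majority class) from soft voting (class with the highest probability). The following is a minimal sketch of that distinction only, not the authors' implementation. It assumes that soft voting averages the per-class probabilities across classifiers, which is a common convention but is not stated in the abstract. The function names `hard_vote` and `soft_vote`, the class labels, and the probability values are all hypothetical and chosen for illustration.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Return the label chosen by the largest number of classifiers."""
    # most_common(1) yields the (label, count) pair with the highest count.
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(class_probabilities):
    """Return the class with the highest mean probability.

    Assumption: probabilities are averaged with equal weight per classifier.
    """
    n = len(class_probabilities)
    classes = class_probabilities[0].keys()
    mean = {c: sum(p[c] for p in class_probabilities) / n for c in classes}
    return max(mean, key=mean.get)


# Hypothetical outputs of three classifiers for a single sample.
probs = [
    {"benign": 0.55, "malignant": 0.45},
    {"benign": 0.60, "malignant": 0.40},
    {"benign": 0.10, "malignant": 0.90},
]

# Each classifier's hard prediction is its most probable class.
labels = [max(p, key=p.get) for p in probs]

hard_result = hard_vote(labels)  # two of three classifiers say "benign"
soft_result = soft_vote(probs)   # mean probability favours "malignant"
```

In this made-up example the two schemes disagree: two weakly confident classifiers outvote one strongly confident classifier under hard voting, while averaging the probabilities lets the confident classifier dominate under soft voting.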