100% classification accuracy considered harmful: The normalized information transfer factor explains the accuracy paradox

e-Archivo Repository

dc.contributor.author Valverde Albacete, Francisco José
dc.contributor.author Peláez Moreno, Carmen
dc.date.accessioned 2015-09-11T10:46:52Z
dc.date.available 2015-09-11T10:46:52Z
dc.date.issued 2014-01
dc.identifier.bibliographicCitation PLoS ONE (2014). 9(1), 10 p.
dc.identifier.issn 1932-6203
dc.identifier.uri http://hdl.handle.net/10016/21473
dc.description.abstract The most widespread measure of performance, accuracy, suffers from a paradox: predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy. Despite optimizing the classification error rate, high-accuracy models may fail to capture crucial information transfer in the classification task. We present evidence of this behavior by means of a combinatorial analysis in which every possible contingency matrix of 2-, 3- and 4-class classifiers is depicted on the entropy triangle, a more reliable information-theoretic tool for classification assessment. Motivated by this, we develop from first principles a measure of classification performance that takes into consideration the information learned by classifiers. We are then able to obtain the entropy-modulated accuracy (EMA), a pessimistic estimate of the expected accuracy with the influence of the input distribution factored out, and the normalized information transfer factor (NIT), a measure of how efficient the transmission of information from the input to the output set of classes is. The EMA is a more natural measure of classification performance than accuracy when the heuristic to maximize is the transfer of information through the classifier instead of the classification error count. The NIT factor measures the effectiveness of the learning process in classifiers and makes it harder for them to "cheat" using techniques like specialization, while also promoting the interpretability of results. Their use is demonstrated in a mind-reading task competition that aims at decoding the identity of a video stimulus from magnetoencephalography recordings. We show how the EMA and the NIT factor reject rankings based on accuracy, choosing more meaningful and interpretable classifiers.
dc.description.sponsorship Francisco José Valverde-Albacete has been partially supported by EU FP7 project LiMoSINe (contract 288024): www.limosine-project.eu. Carmen Peláez Moreno has been partially supported by the Spanish Government-Comisión Interministerial de Ciencia y Tecnología project TEC2011–26807.
dc.format.extent 10
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher PLOS (Public Library of Science)
dc.rights Attribution 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by/3.0/es/
dc.subject.other Classification accuracy
dc.subject.other Classifier
dc.subject.other Combinatorial analysis
dc.subject.other Contingency table
dc.subject.other Controlled study
dc.subject.other Entropy modulated accuracy
dc.subject.other Error
dc.subject.other Information
dc.subject.other Learning
dc.subject.other Magnetoencephalography
dc.subject.other Mathematical phenomena
dc.subject.other Mental task
dc.subject.other Model
dc.subject.other Normalized information transfer factor
dc.subject.other Prediction
dc.subject.other Theory
dc.subject.other Videorecording
dc.subject.other Visual stimulation
dc.subject.other Algorithms
dc.subject.other Humans
dc.subject.other Models, Theoretical
dc.title 100% classification accuracy considered harmful: The normalized information transfer factor explains the accuracy paradox
dc.type article
dc.relation.publisherversion http://dx.doi.org/10.1371/journal.pone.0084217
dc.subject.eciencia Telecommunications
dc.identifier.doi 10.1371/journal.pone.0084217
dc.rights.accessRights openAccess
dc.relation.projectID Gobierno de España. TEC2011–26807
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/288024
dc.type.version publishedVersion
dc.identifier.publicationfirstpage 1
dc.identifier.publicationissue 1
dc.identifier.publicationlastpage 10
dc.identifier.publicationtitle PLOS ONE
dc.identifier.publicationvolume 9
dc.identifier.uxxi AR/0000014514
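
The accuracy paradox described in the abstract can be illustrated with a small worked example. The sketch below is not the authors' method: this record does not give the definitions of the EMA or the NIT factor, so plain mutual information between true and predicted labels is used as a stand-in for "information transfer". The two contingency matrices and the function names are hypothetical, chosen only for illustration.

```python
import math

def accuracy(cm):
    """Fraction of samples on the diagonal of a contingency matrix
    (rows: true class, columns: predicted class)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def mutual_information(cm):
    """Mutual information, in bits, between true and predicted labels.
    Used here as a stand-in for information transfer (an assumption,
    not the EMA or NIT definitions from the paper)."""
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(row[j] for row in cm) for j in range(len(cm[0]))]
    mi = 0.0
    for i, row in enumerate(cm):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                ratio = (n * total) / (row_sums[i] * col_sums[j])
                mi += (n / total) * math.log2(ratio)
    return mi

# Hypothetical data: 90 samples of class 0, 10 samples of class 1.
majority = [[90, 0],      # always predicts class 0
            [10, 0]]
informative = [[70, 20],  # makes more errors, but its output
               [0, 10]]   # depends on the true class

for name, cm in [("majority", majority), ("informative", informative)]:
    print(f"{name}: accuracy={accuracy(cm):.2f}, "
          f"MI={mutual_information(cm):.3f} bits")
```

Under these assumptions the majority-class classifier reaches the higher accuracy while transferring no information at all, whereas the less accurate classifier transfers a positive amount. This is the kind of ranking reversal the abstract refers to.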