Evaluation of machine learning methods in Weka

e-Archivo Repository

dc.contributor.advisor Peláez Moreno, Carmen
dc.contributor.advisor Valverde Albacete, Francisco José
dc.contributor.author Pastor Valles, Antonio Ángel
dc.date.accessioned 2016-02-22T08:38:21Z
dc.date.available 2016-02-22T08:38:21Z
dc.date.issued 2015-09
dc.date.submitted 2015-10-15
dc.identifier.uri http://hdl.handle.net/10016/22339
dc.description.abstract This document presents a software plug-in to endow the Weka machine learning suite with the complete set of information-theoretic tools described by Valverde-Albacete and Peláez-Moreno [Pattern Recognition Letters 31.12 (2010) and PLoS ONE 9.1 (2014)]. The utility of these tools is most evident in multi-class classification, but they can also be used for binary tasks. The Entropy Triangle is an exploratory analysis method that we implemented as an interactive visualization plug-in for Weka. The Entropy Triangle represents, in a De Finetti diagram (ternary plot), a balance equation of entropies for the estimated distributions of the input and the output of classifiers. This diagram provides, at a glance, complete information about the confusion matrix in information-theoretic terms. Besides the Entropy Triangle, the package implements several perplexity-based metrics for the assessment of classifiers. In the context of classification, the perplexity represents the effective number of classes for the classification task, which makes it a useful measure of the propagation of information. Among these metrics, we highlight the Entropy Modified Accuracy, recommended for ranking classifiers, and the Normalized Information Transfer factor, which measures how well a classifier has captured the underlying patterns of the task. The Waikato Environment for Knowledge Analysis (WEKA) is a workbench for machine learning and data mining developed at the University of Waikato, New Zealand. Weka offers several graphical user interfaces, ranging from a user-friendly interactive explorer to an automated mode in which multiple experiments can be statistically compared at the same time. An important feature of Weka is that it can serve as a framework for implementing algorithms, evaluation metrics and visualization tools by means of added components.
In this document we describe the design and development of the software package. Before that, we set the theoretical backdrop by reviewing the implemented tools and their mathematical background. To illustrate the software features and the utility of the tools, we present an example with a multi-class dataset in which we unbalance the class distribution in different ways. Additionally, we show how to use the plug-in programmatically with a guided example. Finally, we review the project in hindsight and propose future work.
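As an illustration of the quantities named in the abstract, the following is a minimal sketch in plain Python, not the Weka plug-in itself. It assumes a confusion matrix with true classes in rows and predicted classes in columns, at least two classes, and entropies in bits. The function names `entropy` and `entropy_triangle` are hypothetical. The formulas for the Entropy Modified Accuracy (taken here as 2^(-H(X|Y))) and the Normalized Information Transfer factor (taken here as 2^MI / k) reflect our reading of the cited papers and should be checked against them.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability cells."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_triangle(confusion):
    """Entropy balance terms and perplexity-based metrics for a confusion matrix.

    Assumption: rows are true classes (X), columns are predicted classes (Y).
    """
    n = sum(sum(row) for row in confusion)
    joint = [[c / n for c in row] for row in confusion]
    px = [sum(row) for row in joint]            # estimated input distribution
    py = [sum(col) for col in zip(*joint)]      # estimated output distribution

    h_x, h_y = entropy(px), entropy(py)
    h_xy = entropy([p for row in joint for p in row])
    h_ux, h_uy = math.log2(len(px)), math.log2(len(py))  # uniform entropies

    mi = h_x + h_y - h_xy                       # mutual information
    vi = 2 * h_xy - h_x - h_y                   # H(X|Y) + H(Y|X)
    delta_h = (h_ux - h_x) + (h_uy - h_y)       # divergence from uniformity

    # Balance equation: h_ux + h_uy == delta_h + 2*mi + vi.
    total = h_ux + h_uy
    return {
        "delta_h": delta_h / total,             # ternary-plot coordinates,
        "mi2": 2 * mi / total,                  # which sum to 1
        "vi": vi / total,
        # Assumed definitions, see the lead-in:
        "ema": 2 ** -(h_xy - h_y),              # 1 / remaining perplexity
        "nit": 2 ** mi / len(px),               # effective classes / k
    }
```

Under these assumptions, a perfect classifier on three balanced classes lands on the vertex where the normalized mutual-information coordinate equals 1, while a classifier that always predicts the same class has zero mutual information and both metrics drop to 1/3.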
dc.format.mimetype application/pdf
dc.language.iso eng
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.other Artificial intelligence
dc.subject.other Learning
dc.subject.other Weka
dc.subject.other Classification
dc.title Evaluation of machine learning methods in Weka
dc.type bachelorThesis
dc.subject.eciencia Telecommunications
dc.rights.accessRights openAccess
dc.description.degree Ingeniería en Tecnologías de Telecomunicación
dc.contributor.departamento Universidad Carlos III de Madrid. Departamento de Teoría de la Señal y Comunicaciones