Intelligent Android malware family classification using Genetic Algorithms and SVM

e-Archivo Repository

dc.contributor.advisor Isasi, Pedro
dc.contributor.advisor Sáez Achaerandio, Yago
dc.contributor.author Yuste Fernández-Alonso, Sara
dc.date.accessioned 2020-02-24T16:11:18Z
dc.date.available 2020-02-24T16:11:18Z
dc.date.issued 2019-07
dc.date.submitted 2019-10-14
dc.description.abstract As of April 2019, Android was the most popular mobile operating system among smartphone users [1]. This popularity, combined with the widespread use of smartphones for everyday tasks and for storing or accessing sensitive personal data, has made Android applications the target of numerous malware attacks in recent years. These attacks have been refined to exploit specific vulnerabilities in the operating system or in user behaviour, so that malware has specialized into types and into families within each type. Malware is usually distributed in infected applications (APKs), whose malicious behaviour can be detected either by inspecting their code (static analysis) or by observing the application while it runs (dynamic analysis). This document describes the implementation of an intelligent system that classifies a set of malicious APK samples, obtained from the free repository ContagioDump, according to the type and family they belong to. The classifier is a Support Vector Machine (SVM) implemented with the Python library scikit-learn. Attributes are extracted from the malicious APK samples through static analysis of their code using the Python library Androguard, which provides a parser for interacting with all the relevant parts of an APK file. Because the number of extracted attributes is very high, a Genetic Algorithm is used to select the attributes that the SVM uses in the learning process. Each candidate solution in the algorithm encodes a subset of the attributes extracted during static analysis and is evaluated using the accuracy score obtained when training the SVM with that subset. The result is a subset of attributes and a trained classification model.
This model is then tested with a new set of malware samples belonging to all the families seen during learning. The document explains the process of designing, creating and testing the system. It was developed as a bachelor's thesis for the Computer Science and Engineering degree at Universidad Carlos III de Madrid.
dc.language.iso eng
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.subject.other Genetic algorithms
dc.subject.other Neural networks
dc.subject.other Support Vector Machine (SVM)
dc.subject.other Android (Operating system)
dc.subject.other Malware
dc.subject.other Artificial Intelligence
dc.title Intelligent Android malware family classification using Genetic Algorithms and SVM
dc.type bachelorThesis
dc.subject.eciencia Computer Science
dc.rights.accessRights openAccess
dc.description.degree Ingeniería en Tecnologías de Telecomunicación (Plan 2010)
dc.contributor.departamento Universidad Carlos III de Madrid. Departamento de Informática
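
The attribute selection described in the abstract (a Genetic Algorithm whose candidates encode attribute subsets and are scored by SVM accuracy) can be illustrated with a minimal sketch. It is not the thesis's implementation: the bit-mask encoding, tournament selection, one-point crossover, elitism, the linear kernel, the use of cross-validation and all function and parameter names (`evolve_mask`, `make_svm_fitness`, `pop_size`, and so on) are assumptions chosen for illustration. The SVM fitness needs NumPy and scikit-learn; the GA itself uses only the standard library.

```python
import random
from typing import Callable, List

Mask = List[int]  # 1 = attribute kept, 0 = attribute dropped (assumed encoding)


def evolve_mask(n_features: int,
                fitness: Callable[[Mask], float],
                pop_size: int = 20,
                generations: int = 30,
                mutation_rate: float = 0.05,
                seed: int = 0) -> Mask:
    """Return the best attribute mask found by a simple generational GA.

    Assumes n_features >= 2 and pop_size >= 2.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_score, best = float("-inf"), population[0][:]

    for _ in range(generations):
        # Evaluate every candidate once per generation.
        scored = [(fitness(ind), ind) for ind in population]
        gen_score, gen_best = max(scored, key=lambda pair: pair[0])
        if gen_score > best_score:
            best_score, best = gen_score, gen_best[:]

        def tournament() -> Mask:
            # Binary tournament: the fitter of two random candidates wins.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children

    return best


def make_svm_fitness(X, y, folds: int = 3) -> Callable[[Mask], float]:
    """Build a fitness that scores a mask by cross-validated SVM accuracy.

    X is a 2-D array of attributes, y the family labels (hypothetical inputs).
    """
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X = np.asarray(X)

    def fitness(mask: Mask) -> float:
        cols = [i for i, bit in enumerate(mask) if bit]
        if not cols:
            return 0.0  # an empty subset cannot train a classifier
        scores = cross_val_score(SVC(kernel="linear"), X[:, cols], y,
                                 cv=folds, scoring="accuracy")
        return float(scores.mean())

    return fitness
```

With an attribute matrix `X` and labels `y` at hand, a call such as `evolve_mask(X.shape[1], make_svm_fitness(X, y))` would return the selected mask. Every fitness evaluation trains several SVMs, so the population size and generation count dominate the running time.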