Publication:
Intelligent Android malware family classification using Genetic Algorithms and SVM

dc.contributor.advisor: Isasi, Pedro
dc.contributor.advisor: Sáez Achaerandio, Yago
dc.contributor.author: Yuste Fernández-Alonso, Sara
dc.contributor.departamento: UC3M. Departamento de Informática [es]
dc.date.accessioned: 2020-02-24T16:11:18Z
dc.date.available: 2020-02-24T16:11:18Z
dc.date.issued: 2019-07
dc.date.submitted: 2019-10-14
dc.description.abstract: As of April 2019, Android was the most popular mobile operating system among smartphone users [1]. This popularity, combined with the widespread use of smartphones for everyday tasks and for storing or accessing sensitive personal data, has made Android applications the target of numerous malware attacks in recent years. These attacks have been refined to target specific vulnerabilities in the operating system or the user, and have therefore specialised into malware types and, within each type, families. Malware is usually distributed in infected applications (APKs), whose malicious behaviour can be found by inspecting their code (static analysis) or by observing the application while it runs (dynamic analysis). This document describes the implementation of an intelligent system that classifies a set of malicious APK samples, obtained from the free repository ContagioDump, by the type and family they belong to. The classifier is a Support Vector Machine (SVM) implemented with the Python library Scikit Learn. Attributes are extracted from the malicious APK samples through static analysis with the Python library Androguard, whose parser gives access to all the relevant parts of the APK file. Because the number of extracted attributes is very high, a Genetic Algorithm is used to select the attributes that the SVM uses in the learning process. Each individual in the algorithm encodes a subset of the attributes extracted in the static analysis and is evaluated by the accuracy score obtained when training the SVM with that subset. The result is a subset of attributes and a trained classification model.
This model is then tested with a new set of malware samples belonging to all the families classified during learning. The present document explains the process of designing, building and testing the system. It was developed as a bachelor's thesis for the computer science and engineering degree at Universidad Carlos III de Madrid. [es]
dc.description.degree: Ingeniería en Tecnologías de Telecomunicación (Plan 2010) [es]
dc.identifier.uri: https://hdl.handle.net/10016/29770
dc.language.iso: eng [es]
dc.rights: Atribución-NoComercial-SinDerivadas 3.0 España [*]
dc.rights.accessRights: open access [es]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.subject.eciencia: Informática [es]
dc.subject.other: Genetic algorithms [es]
dc.subject.other: Neural networks [es]
dc.subject.other: Support Vector Machine (SVM) [es]
dc.subject.other: Android (Operating system) [es]
dc.subject.other: Malware [es]
dc.subject.other: Artificial Intelligence [es]
dc.title: Intelligent Android malware family classification using Genetic Algorithms and SVM [es]
dc.type: bachelor thesis [*]
dspace.entity.type: Publication
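
The abstract describes a Genetic Algorithm whose individuals encode attribute subsets and whose fitness is the accuracy of an SVM trained on that subset. The following is a minimal sketch of that idea, not the thesis's actual implementation: the function names (`select_features`, `svm_accuracy`), the binary-mask encoding, tournament selection, one-point crossover, elitism, the linear kernel, 3-fold cross-validation and all default parameters are assumptions made for illustration.

```python
import random


def svm_accuracy(mask, X, y):
    """Hypothetical fitness: mean cross-validated SVM accuracy on the selected columns."""
    # Imported lazily so the GA below only depends on the standard library.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:  # an empty attribute subset cannot be trained on
        return 0.0
    X = np.asarray(X)
    # Kernel and fold count are illustrative choices, not taken from the thesis.
    return cross_val_score(SVC(kernel="linear"), X[:, selected], y, cv=3).mean()


def select_features(n_features, fitness, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Search for a good binary attribute mask; returns (best_mask, best_score).

    Assumes n_features >= 2 and pop_size >= 2.
    """
    rng = random.Random(seed)
    # Each individual is a list of 0/1 flags, one per extracted attribute.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask, best_score = None, float("-inf")

    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        for score, ind in scored:
            if score > best_score:
                best_mask, best_score = ind[:], score

        def tournament():
            # Pick two distinct individuals and keep the fitter one.
            a, b = rng.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]

        next_population = [best_mask[:]]  # elitism: carry the best mask forward
        while len(next_population) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population

    return best_mask, best_score
```

Given a feature matrix `X` and family labels `y` (both hypothetical here), the search could be started with `select_features(X.shape[1], lambda m: svm_accuracy(m, X, y))`. Keeping the fitness function as a parameter lets the same loop be tried with a cheaper scoring function before running the SVM on the full attribute set.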
Files
Original bundle
Name: TFG_Sara-Yuste_Fernandez_Alonso.pdf
Size: 3.05 MB
Format: Adobe Portable Document Format
Description: TFG