An analysis of Android malware classification services

e-Archivo Repository

dc.contributor.author Rashed, Mohammed Ahmed Fahim
dc.contributor.author Suárez de Tangil Rotaeche, Guillermo Nicolás
dc.date.accessioned 2022-03-15T08:27:33Z
dc.date.available 2022-03-15T08:27:33Z
dc.date.issued 2021-08
dc.identifier.bibliographicCitation Rashed, M., & Suarez-Tangil, G. (2021). An Analysis of Android Malware Classification Services. In Sensors (Vol. 21, Issue 16, p. 5671). MDPI AG.
dc.identifier.issn 1424-8220
dc.identifier.uri http://hdl.handle.net/10016/34372
dc.description.abstract The growing volume of Android malware has forced antivirus (AV) companies to rely on automated classification techniques to determine the family and class of suspicious samples. The research community relies heavily on such labels to carry out prevalence studies of the threat ecosystem and to build datasets that are used to validate and benchmark novel detection and classification methods. In this work, we carry out an extensive study of the Android malware ecosystem by surveying white papers and reports from 6 key players in the industry, as well as 81 papers from 8 top security conferences, to understand how both communities use malware datasets. We then explore the limitations associated with the use of available malware classification services, namely VirusTotal (VT) engines, for determining the family of an Android sample. Using a dataset of 2.47 M Android malware samples, we find that the detection coverage of VT's AVs is generally very low, that the percentage of samples flagged by any 2 AV engines does not exceed 52%, and that the share of families common to any pair of AV engines is at best 29%. We rely on clustering to determine the extent to which different AV engine pairs agree on which samples belong to the same family (regardless of the actual family name) and find discrepancies that can introduce noise in automatic label unification schemes. We also observe the use of generic labels and inconsistencies within the labels of top AV engines, suggesting that their efforts are directed towards accurate detection rather than classification. Our results contribute to a better understanding of the limitations of using Android malware family labels as supplied by common AV engines.
dc.description.sponsorship This work has been supported by the “Ramon y Cajal” Fellowship RYC-2020-029401.
dc.format.extent 31
dc.language.iso eng
dc.publisher MDPI
dc.rights © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
dc.rights Attribution 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by/3.0/es/
dc.subject.other Android
dc.subject.other Antivirus
dc.subject.other Classification
dc.subject.other Clustering
dc.subject.other Family
dc.subject.other Labels
dc.subject.other Malware
dc.subject.other Virustotal
dc.title An analysis of Android malware classification services
dc.type article
dc.subject.eciencia Computer Science
dc.identifier.doi https://doi.org/10.3390/s21165671
dc.rights.accessRights openAccess
dc.relation.projectID Government of Spain. RYC-2020-029401.
dc.type.version publishedVersion
dc.identifier.publicationfirstpage 5671
dc.identifier.publicationissue 16
dc.identifier.publicationlastpage 5702
dc.identifier.publicationtitle An analysis of Android malware classification services
dc.identifier.publicationvolume 21
dc.identifier.uxxi AR/0000028920
dc.contributor.funder Ministerio de Economía y Competitividad (España)