dc.contributor.author |
Reviriego Vasallo, Pedro
|
dc.contributor.author |
Sánchez Macián, Alfonso |
dc.contributor.author |
Liu, Shanshan |
dc.contributor.author |
Lombardi, Fabrizio |
dc.date.accessioned |
2022-09-21T08:31:24Z |
dc.date.available |
2022-09-21T08:31:24Z |
dc.date.issued |
2021-07-28 |
dc.identifier.bibliographicCitation |
Reviriego, P., Sanchez-Macian, A., Liu, S. & Lombardi, F. (2021). On the Security of the K Minimum Values (KMV) Sketch. IEEE Transactions on Dependable and Secure Computing, 1-1. |
dc.identifier.issn |
1545-5971 |
dc.identifier.uri |
http://hdl.handle.net/10016/35750 |
dc.description.abstract |
Data sketches are widely used to accelerate operations in big data analytics. For example, algorithms use sketches to compute the cardinality of a set, or the similarity between two sets. Sketches achieve significant reductions in computing time and storage requirements by providing probabilistic estimates rather than exact values. In many applications, an estimate is sufficient and thus, it is possible to trade accuracy for computational complexity; this enables the use of probabilistic sketches. However, the use of probabilistic data structures may create security issues because an attacker may manipulate the data in such a way that the sketches produce an incorrect estimate. For example, an attacker could potentially inflate the estimate of the number of distinct users to increase its revenues or popularity. Recent works have shown that an attacker can manipulate Hyperloglog, a sketch widely used for cardinality estimate, with no knowledge of its implementation details. This paper considers the security of K Minimum Values (KMV), a sketch that is also widely used to implement both cardinality and similarity estimates. Next sections characterize vulnerabilities at an implementationindependent level, with attacks formulated as part of a novel adversary model that manipulates the similarity estimate. Therefore, the paper pursues an analysis and simulation; the results suggest that as vulnerable to attacks, an increase or reduction of the estimate may occur. The execution of the attacks against the KMV implementation in the Apache DataSketches library validates these scenarios. Experiments show an excellent agreement between theory and experimental results. |
dc.description.sponsorship |
Pedro Reviriego acknowledges the support of the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T funded by the Spanish Ministry of Economy and Competitivity and of the Madrid Community research project TAPIR-CM under Grant P2018/TCS-4496. |
dc.format.extent |
7 |
dc.language.iso |
eng |
dc.publisher |
IEEE |
dc.rights |
© 2021 IEEE |
dc.subject.other |
Data sketches |
dc.subject.other |
Cardinality |
dc.subject.other |
Kmv |
dc.subject.other |
Similarity |
dc.subject.other |
Security |
dc.subject.other |
Attack |
dc.title |
On the Security of the K Minimum Values (KMV) Sketch |
dc.type |
article |
dc.subject.eciencia |
Informática |
dc.identifier.doi |
https://doi.org/10.1109/TDSC.2021.3101280 |
dc.rights.accessRights |
openAccess |
dc.relation.projectID |
Gobierno de España. PID2019-104207RB-I00 |
dc.relation.projectID |
Gobierno de España. RED2018-102585-T |
dc.relation.projectID |
Gobierno de España. P2018/TCS-4496 |
dc.type.version |
acceptedVersion |
dc.identifier.publicationfirstpage |
3539 |
dc.identifier.publicationlastpage |
3545 |
dc.identifier.publicationtitle |
IEEE Transactions on Dependable and Secure Computing |
dc.identifier.uxxi |
AR/0000028231 |
dc.contributor.funder |
Comunidad de Madrid |
dc.contributor.funder |
Ministerio de Economía y Competitividad (España) |
dc.affiliation.dpto |
UC3M. Departamento de Ingeniería Telemática |
dc.affiliation.grupoinv |
UC3M. Grupo de Investigación: Network Technologies |