Breadth analysis of Online Social Networks

e-Archivo Repository

dc.contributor.advisor Sánchez, Angel
dc.contributor.advisor Fernández Anta, Antonio
dc.contributor.author Chiroque Núñez, Luis Felipe
dc.date.accessioned 2022-05-10T09:53:11Z
dc.date.available 2022-05-10T09:53:11Z
dc.date.issued 2021-09
dc.date.submitted 2021-11-15
dc.description.abstract This thesis is mainly motivated by the analysis, understanding, and prediction of human behaviour through the study of people's digital fingerprints. Unlike a classical PhD thesis, which picks one topic and analyzes it in depth, we carried out a breadth analysis of complex networks, such as those that humans create through their relationships and interactions. The digital communities where humans interact and form relationships are commonly called Online Social Networks. First, (i) we collected user interactions, in the form of the text messages users share with each other, in order to analyze the sentiment and topic of those messages. We applied state-of-the-art Natural Language Processing techniques, widely developed and tested on English texts, to a collection of Spanish Tweets and compared the results. Next, (ii) we focused on Topic Detection, creating our own classifier and applying it to the same Tweets dataset. There are two main contributions: our classifier relies on text graphs built from the input text, and it achieves 70% accuracy, outperforming previous results. After that, (iii) we moved on to analyzing the network structure (or topology) and its data values to detect outliers. We hypothesize that in social networks a large mass of users behaves similarly, while a small set of users behaves differently. Within this last group in particular, we try to separate users by high activity, low activity, or any other parameter or feature that places them in a different kind of outlier set. We aim to detect influential users in one of these outlier sets. We propose a new unsupervised method, Massive Unsupervised Outlier Detection (MUOD), which labels the detected outliers as shape, magnitude, or amplitude outliers, or a combination of those.
We applied this method to a subset of roughly 400 million Google+ users, automatically identifying and discriminating sets of outlier users. Finally, (iv) we address the monitoring of real complex networks. We created a framework to dynamically adapt the temporality of large-scale dynamic networks, reducing compute overhead by at least 76%, data volume by 60%, and overall cloud costs by at least 54%, while always maintaining accuracy above 88%.
dc.language.iso eng
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.subject.other Biometric
dc.subject.other Fingerprint recognition
dc.subject.other Online social networks
dc.subject.other Natural language processing
dc.subject.other Massive Unsupervised Outlier Detection (MUOD)
dc.subject.other Algorithms
dc.title Breadth analysis of Online Social Networks
dc.type doctoralThesis
dc.description.status Published
dc.subject.eciencia Computer Science
dc.rights.accessRights openAccess
dc.description.degree Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid
dc.description.responsability President: Rosa María Benito Zafrilla.- Secretary: Ángel Cuevas Rumín.- Member: José Ernesto Jiménez Merino
dc.contributor.departamento Universidad Carlos III de Madrid. Departamento de Matemáticas
dc.contributor.tutor Sánchez, Angel