SocialHaterBERT: A dichotomous approach for automatically detecting hate speech on Twitter through textual analysis and user profiles

Valle Cano, Gloria del; Quijano Sánchez, Lara; Liberatore, Federico; Gómez, Jesús


dc.affiliation.instituto	UC3M. Instituto UC3M - Santander de Big Data	es
dc.contributor.author	Valle Cano, Gloria del
dc.contributor.author	Quijano Sánchez, Lara
dc.contributor.author	Liberatore, Federico
dc.contributor.author	Gómez, Jesús
dc.contributor.funder	European Commission	es
dc.contributor.funder	Ministerio de Ciencia e Innovación (España)	es
dc.date.accessioned	2023-11-16T10:10:24Z
dc.date.available	2023-11-16T10:10:24Z
dc.date.issued	2023-04-15
dc.description.abstract	Social media platforms have evolved into an online representation of our social interactions. We may use the resources they provide to analyze phenomena that occur within them, such as the development and viralization of offensive and hostile content. In today's polarized world, the escalating nature of this behavior is cause for concern in modern society. This research includes an in-depth examination of previous efforts and strategies for detecting and preventing hateful content on the social network Twitter, as well as a novel classification approach based on users' profiles, related social environment and generated tweets. This paper's contribution is threefold: (i) an improvement in the performance of the HaterNet algorithm, an expert system developed in collaboration with the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security (Ministry of the Interior) that is capable of identifying and monitoring the evolution of hate speech on Twitter using an LSTM + MLP neural network architecture. To that end, a model based on BERT, HaterBERT, has been created and tested using HaterNet's public dataset, providing results that show a significant improvement; (ii) a methodology to create a user database in the form of a relational network to infer textual and centrality features. This contribution, SocialGraph, has been independently tested with various traditional Machine Learning and Deep Learning algorithms, demonstrating its usefulness in spotting haters; (iii) a final model, SocialHaterBERT, that integrates the previous two approaches by analyzing features other than those inherent in the text. Experimental results reveal that this last contribution greatly improves outcomes, establishing a new field of study that transcends textual boundaries and paving the way for future research in coupled models from a diachronic and dynamic perspective.	en
dc.description.sponsorship	The research of Quijano-Sánchez was conducted with financial support from the Spanish Ministry of Science and Innovation, grant PID2019-108965GB-I00. The research of Liberatore is partially funded by the European Commission's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant number MSCA-RISE 691161 (GEO-SAFE), and by the Government of Spain, grant MTM2015-65803-R.	en
dc.description.status	Publicado	es
dc.format.extent	17
dc.identifier.bibliographicCitation	Expert Systems with Applications, (2023), 216:119446, (17 p.).	en
dc.identifier.doi	https://doi.org/10.1016/j.eswa.2022.119446
dc.identifier.issn	0957-4174
dc.identifier.publicationfirstpage	1
dc.identifier.publicationissue	119446
dc.identifier.publicationlastpage	17
dc.identifier.publicationtitle	EXPERT SYSTEMS WITH APPLICATIONS	en
dc.identifier.publicationvolume	216
dc.identifier.uri	https://hdl.handle.net/10016/38884
dc.identifier.uxxi	AR/0000033524
dc.language.iso	eng	en
dc.publisher	Elsevier	en
dc.relation.projectID	Gobierno de España. PID2019-108965GB-I00	es
dc.relation.projectID	Gobierno de España. MTM2015-65803-R	es
dc.rights	© 2022 The Author(s). Published by Elsevier Ltd.	en
dc.rights	This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).	en
dc.rights	Atribución-NoComercial-SinDerivadas 3.0 España	*
dc.rights.accessRights	open access	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.subject.eciencia	Informática	es
dc.subject.other	Hate speech	en
dc.subject.other	Twitter	en
dc.subject.other	Deep learning	en
dc.subject.other	Social network analysis	en
dc.subject.other	Bidirectional encoder representations from transformers	en
dc.subject.other	BERT	en
dc.subject.other	Topic modeling	en
dc.title	SocialHaterBERT: A dichotomous approach for automatically detecting hate speech on Twitter through textual analysis and user profiles	en
dc.type	research article	*
dc.type.hasVersion	VoR	*
dspace.entity.type	Publication