Deep learning for information extraction in the biomedical domain

dc.contributor.advisor: Segura-Bedmar, Isabel
dc.contributor.author: Suárez Paniagua, Víctor
dc.contributor.departamento: UC3M. Departamento de Informática [es]
dc.contributor.funder: Ministerio de Economía y Competitividad (España) [es]
dc.description: International Mention in the doctoral degree
dc.description.abstract: The main hypothesis of this PhD dissertation is that novel Deep Learning algorithms can outperform classical Machine Learning methods for the task of Information Extraction in the Biomedical Domain. Unlike classical systems, Deep Learning models can learn the representation of the data automatically, without expert domain knowledge, and so avoid the tedious and time-consuming task of defining relevant features. A Drug-Drug Interaction (DDI), which is an essential subset of Adverse Drug Reactions (ADRs), is an alteration in the effects of drugs that are taken simultaneously. The early recognition of interacting drugs is vital for preventing serious health problems that, in the worst cases, can cause death. Health-care professionals and researchers in this domain find it very challenging to discover information about these incidents because of the vast number of pharmacovigilance documents. For this reason, several shared tasks and datasets have been developed to address the problem with automated annotation systems capable of extracting this information. In the present document, the DDI corpus, which is an annotated dataset of DDIs, is used with Deep Learning architectures, without any external information, for the tasks of Named Entity Recognition and Relation Extraction in order to validate the hypothesis. Furthermore, other datasets are tested to provide additional evidence of the performance of these systems. To sum up, the results suggest that the most common Deep Learning methods, such as Convolutional Neural Networks and Recurrent Neural Networks, outperform the traditional algorithms, leading to the conclusion that Deep Learning is a real alternative for a specific and complex scenario such as Information Extraction in the Biomedical domain. As a final goal, a complete architecture that covers the two tasks is developed to structure the named entities and their relationships from raw pharmacological texts. [en]
dc.description.degree: Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid
dc.description.responsability: Chair: Ricardo Aler Mur. Secretary: Alberto Díaz Esteban. Member: María Herrero Zazo
dc.description.sponsorship: This thesis has been supported by: a four-year pre-doctoral research training scholarship of the Carlos III University of Madrid (PIF UC3M 02-1415); the Research Program of the Ministry of Economy and Competitiveness, Government of Spain (DeepEMR project TIN2017-87548-C2-1-R); the Research Program of the Ministry of Economy and Competitiveness, Government of Spain (eGovernAbility-Access project TIN2014-52665-C2-2-R); and a seven-month doctoral stay at the Aizawa Laboratory, National Institute of Informatics (NII), University of Tokyo, Japan, under the TEAM project (Technologies for information and communication, Europe - east Asia Mobilities; Erasmus Mundus Action 2-Strand 2 Programme) funded by the European Commission. [en]
dc.relation.projectID: Gobierno de España. TIN2017-87548-C2-1-R [es]
dc.relation.projectID: Gobierno de España. TIN2014-52665-C2-2-R [es]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain [*]
dc.rights.accessRights: open access [en]
dc.subject.other: Deep learning algorithms [en]
dc.subject.other: Machine learning methods [en]
dc.subject.other: Information extraction [en]
dc.subject.other: Neural networks [en]
dc.subject.other: Biomedical named recognition [en]
dc.title: Deep learning for information extraction in the biomedical domain [en]
dc.type: doctoral thesis [*]