Department/Institute:
UC3M. Departamento de Informática
Degree:
Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de Madrid
Issued date:
2019-03
Defense date:
2019-07-05
Committee:
Chair: Ricardo Aler Mur. Secretary: Alberto Díaz Esteban. Member: María Herrero Zazo
Funder:
Ministerio de Economía y Competitividad (España)
Sponsor:
This thesis has been supported by:
Pre-doctoral research training scholarship of the Carlos III University of Madrid (PIF UC3M 02-1415) for four years.
Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R).
Research Program of the Ministry of Economy and Competitiveness - Government of Spain (eGovernAbility-Access project TIN2014-52665-C2-2-R).
Doctoral stay under the TEAM (Technologies for information and communication, Europe - East Asia Mobilities) project (Erasmus Mundus Action 2-Strand 2 Programme), funded by the European Commission, carried out at the University of Tokyo, Japan, for the Aizawa Laboratory at the National Institute of Informatics (NII) for seven months.
Project:
Gobierno de España. TIN2017-87548-C2-1-R
Gobierno de España. TIN2014-52665-C2-2-R
Keywords:
Deep learning algorithms, Machine learning methods, Information extraction, Neural networks, Biomedical named recognition, Biomedicine
Rights:
Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract:
The main hypothesis of this PhD dissertation is that novel Deep Learning algorithms can outperform classical Machine Learning methods for the task of Information Extraction in the Biomedical Domain. Contrary to classical systems, Deep Learning models can learn the representation of the data automatically, without expert domain knowledge, and avoid the tedious and time-consuming task of defining relevant features.
A Drug-Drug Interaction (DDI), an essential subset of Adverse Drug Reactions (ADRs), is an alteration in the effects of drugs that are taken simultaneously. The early recognition of interacting drugs is vital to prevent serious health problems that, in the worst cases, can cause death. Health-care professionals and researchers in this domain find it very challenging to discover information about these incidents because of the vast number of pharmacovigilance documents. For this reason, several shared tasks and datasets have been developed to address this issue with automated annotation systems capable of extracting this information. In the present document, the DDI corpus, an annotated dataset of DDIs, is used with Deep Learning architectures, without any external information, for the tasks of Named Entity Recognition and Relation Extraction in order to validate the hypothesis. Furthermore, other datasets are tested to provide further evidence of the performance of these systems.
To sum up, the results suggest that the most common Deep Learning methods, such as Convolutional Neural Networks and Recurrent Neural Networks, outperform the traditional algorithms, which supports the conclusion that Deep Learning is a real alternative for a specific and complex scenario such as Information Extraction in the Biomedical domain. As a final goal, a complete architecture that covers the two tasks is developed to structure the named entities and their relationships from raw pharmacological texts.