Publication: Application of information extraction techniques to pharmacological domain: extracting drug-drug interaction
Publication date
2011-01
Publisher
Sociedad Española para el Procesamiento del Lenguaje Natural
Abstract
A drug-drug interaction occurs when one drug influences the level or activity of another drug. The
detection of drug interactions is an important research area in patient safety since these interactions can
become very dangerous and increase health care costs. Although there are different databases supporting
health care professionals in the detection of drug interactions, this kind of resource is rarely complete.
Drug interactions are frequently reported in journals of clinical pharmacology, making medical literature
the most effective source for the detection of drug interactions. However, the increasing volume of the
literature overwhelms health care professionals trying to keep an up-to-date collection of all reported drug-drug
interactions. The development of automatic methods for collecting, maintaining and interpreting this
information is crucial for achieving a real improvement in their early detection. Information Extraction
(IE) techniques can provide an interesting way of reducing the time spent by health care professionals on
reviewing the literature. Nevertheless, no approach has yet been proposed for extracting drug-drug interactions
from biomedical texts.
We have conducted a detailed study of various IE techniques applied to the biomedical domain. Based
on this study, we have proposed two different approaches to the extraction of drug-drug interactions
from texts. The first is a hybrid approach, which combines shallow parsing and
pattern matching to extract relations between drugs from biomedical texts. The second
is a supervised machine learning approach based, in particular, on kernel methods. In addition, we have
created DrugDDI, the first corpus annotated with drug-drug interactions, which allows us
to evaluate and compare both approaches. To the best of our knowledge, the DrugDDI corpus is the
only available corpus annotated for drug-drug interactions and this research is the first work addressing
the problem of extracting drug-drug interactions from biomedical texts. We believe the DrugDDI corpus
is an important contribution because it could encourage other research groups to investigate this
problem. We have also defined three auxiliary processes that provide crucial information
used by the aforementioned approaches. These auxiliary processes are as follows: (1) a process for text
analysis based on the UMLS MetaMap Transfer tool (MMTx) to provide shallow syntactic and semantic
information from texts, (2) a process for drug name recognition and classification, and (3) a process for
drug anaphora resolution. Finally, we have developed a pipeline prototype which integrates the different
auxiliary processes. The pipeline architecture allows us to easily integrate these modules with each of
the approaches proposed in this work: pattern matching or kernel methods. Several experiments were performed
on the DrugDDI corpus. They show that, while the first approach, based on pattern matching,
achieves low performance, the approach based on kernel methods achieves performance comparable to
that obtained by approaches to similar tasks, such as the extraction of protein-protein
interactions.
Bibliographic citation
Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN). Monografías 10 (2011) September