Publication:
Application of information extraction techniques to pharmalogical domain: extracting drug-drug interaction

Loading...
Thumbnail Image
Identifiers
Publication date
2011-01
Defense date
Advisors
Tutors
Journal Title
Journal ISSN
Volume Title
Publisher
Sociedad EspaƱola para el Procesamiento del Lenguaje Natural
Impact
Google Scholar
Export
Research Projects
Organizational Units
Journal Issue
Abstract
A drug-drug interaction occurs when one drug influences the level or activity of another drug. The detection of drug interactions is an important research area in patient safety since these interactions can become very dangerous and increase health care costs. Although there are different databases supporting health care professionals in the detection of drug interactions, this kind of resource is rarely complete. Drug interactions are frequently reported in journals of clinical pharmacology, making medical literature the most effective source for the detection of drug interactions. However, the increasing volume of the literature overwhelms health care professionals trying to keep an up-to-date collection of all reported drugdrug interactions. The development of automatic methods for collecting, maintaining and interpreting this information is crucial for achieving a real improvement in their early detection. Information Extraction (IE) techniques can provide an interesting way of reducing the time spent by health care professionals on reviewing the literature. Nevertheless, no approach has been carried out to extract drug-drug interactions from biomedical texts. We have conducted a detailed study on various IE techniques applied to biomedical domain. Based on this study, we have proposed two different approximations for the extraction of drug-drug interactions from texts. The first approximation proposes a hybrid approach, which combines shallow parsing and pattern matching to extract relations between drugs from biomedical texts. The second approximation is based on a supervised machine learning approach, in particular, kernel methods. In addition, we have created and annotated the first corpus, DrugDDI, annotated with drug-drug interactions, which allow us to evaluate and compare both approximations. To the best of our knowledge, the DrugDDI corpus is the only available corpus annotated for drug-drug interactions and this research is the first work addressing the problem of extracting drug-drug interactions from biomedical texts. We believe the DrugDDI corpus is an important contribution because it could encourage other research groups to research into this problem. We have also defined three auxiliary processes to provide crucial information, which will be used by the aforementioned approximations. These auxiliary tasks are as follows: (1) a process for text analysis based on the UMLS MetaMap Transfer tool (MMTx) to provide shallow syntactic and semantic information from texts, (2) a process for drug name recognition and classification, and (3) a process for drug anaphora resolution. Finally, we have developed a pipeline prototype which integrates the different auxiliary processes. The pipeline architecture allows us to easily integrate these modules with each of the approaches proposed in this work: pattern-matching or kernels. Several experiments were performed on the DrugDDI corpus. They show that while the first approximation based on pattern matching achieves low performance, the approach based on kernel-methods achieves a performance comparable to those obtained by approaches which carry out a similar task such as the extraction of protein-protein interactions.
Description
Keywords
Bibliographic citation
Sociedad EspaƱola para el Procesamiento del Lenguaje Natural (SEPLN). Monografƭas 10 (2011) September
Collections