Publication:
Predicting of anaphylaxis in big data EMR by exploring machine learning approaches

dc.affiliation.dptoUC3M. Departamento de Informáticaes
dc.affiliation.grupoinvUC3M. Grupo de Investigación: Human Language and Accessibility Technologies (HULAT)en
dc.contributor.authorSegura-Bedmar, Isabel
dc.contributor.authorColón Ruiz, Cristóbal
dc.contributor.authorTejedor Alonso, Miguel Ángel
dc.contributor.authorMoro Moro, Mar
dc.contributor.funderMinisterio de Economía y Competitividad (España)es
dc.date.accessioned2024-01-15T08:51:18Z
dc.date.available2024-01-15T08:51:18Z
dc.date.issued2018-11-01
dc.description.abstractAnaphylaxis is a life-threatening allergic reaction that occurs suddenly after contact with an allergen. Epidemiological studies of anaphylaxis are important both for planning and evaluating new strategies to prevent this reaction and for guiding the treatment of patients who have just suffered an anaphylactic reaction. Electronic Medical Records (EMR) are one of the most effective and richest sources for the epidemiology of anaphylaxis, because they provide a low-cost way of accessing rich longitudinal data on large populations. However, researchers have to manually review a huge amount of information, which is a costly and highly time-consuming task. Therefore, our goal is to explore different machine learning techniques to process Big Data EMR, reducing the effort needed to perform epidemiological studies of anaphylaxis. In particular, we aim to study the incidence of anaphylaxis through the automatic classification of EMR. To do this, we employ the most widely used and efficient classifiers in text classification and compare different document representations, which range from well-known methods such as Bag of Words (BoW) to more recent ones based on word embedding models, such as a simple average of word embeddings or a bag of centroids of word embeddings. Because the identification of anaphylaxis cases in EMR is a class-imbalanced problem (fewer than 1% of the records describe anaphylaxis cases), we employ a novel undersampling technique based on clustering to balance our dataset. In addition to classical machine learning algorithms, we also use a Convolutional Neural Network (CNN) to classify our dataset.en
dc.description.sponsorshipThis work was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R).en
dc.identifier.bibliographicCitationSegura Bedmar, I., Colon Ruiz, C., Tejedor Alonso, M.A., Moro Moro, M. (2018). Predicting of anaphylaxis in big data EMR by exploring machine learning approaches. Journal of Biomedical Informatics, 87, pp. 50-59.en
dc.identifier.doihttps://doi.org/10.1016/j.jbi.2018.09.012
dc.identifier.issn1532-0464
dc.identifier.publicationfirstpage50
dc.identifier.publicationlastpage59
dc.identifier.publicationtitleJournal of Biomedical Informaticsen
dc.identifier.publicationvolume87
dc.identifier.urihttps://hdl.handle.net/10016/39217
dc.identifier.uxxiAR/0000022190
dc.language.isoeng
dc.publisherElsevier
dc.relation.projectIDGobierno de España. TIN2017-87548-C2-1-Res
dc.rights© 2018 Elsevier Inc.es
dc.rightsAttribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.accessRightsopen accessen
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/3.0/es/*
dc.subject.ecienciaInformáticaes
dc.subject.otherdeep learningen
dc.subject.othertext classificationen
dc.subject.otherepidemiological studiesen
dc.subject.otheranaphylaxisen
dc.titlePredicting of anaphylaxis in big data EMR by exploring machine learning approachesen
dc.typeresearch articleen
dc.type.hasVersionAMen
dspace.entity.typePublication
Files
Original bundle
Name: predicting_JBI_2018_ps.pdf
Size: 539.51 KB
Format: Adobe Portable Document Format
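Illustrative note on the document representations named in the abstract. The sketch below is not the authors' implementation; it only shows, under simplifying assumptions, how Bag of Words, a simple average of word embeddings, and a bag of centroids differ as inputs to a classifier. The toy vocabulary, the two-dimensional vectors, the hand-assigned clusters, and all function names are hypothetical; in practice the embeddings would be learned from a corpus and the clusters obtained with a clustering algorithm such as k-means.

```python
from collections import Counter

# Hypothetical 2-dimensional "embeddings", invented for illustration only.
TOY_EMBEDDINGS = {
    "rash": [1.0, 0.0],
    "swelling": [0.8, 0.2],
    "epinephrine": [0.9, 0.1],
    "fracture": [0.0, 1.0],
    "cast": [0.1, 0.9],
}

# Hypothetical cluster ids, assigned by hand here in place of a real
# clustering step over the embedding space.
TOY_CLUSTERS = {
    "rash": 0, "swelling": 0, "epinephrine": 0,
    "fracture": 1, "cast": 1,
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def average_embedding(text, embeddings):
    """Average the vectors of the known tokens; zero vector if none is known."""
    dim = len(next(iter(embeddings.values())))
    vectors = [embeddings[t] for t in tokenize(text) if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def bag_of_centroids(text, word_to_cluster, n_clusters):
    """Count how many tokens of the text fall into each embedding cluster."""
    counts = [0] * n_clusters
    for token in tokenize(text):
        if token in word_to_cluster:
            counts[word_to_cluster[token]] += 1
    return counts
```

For a record such as "rash swelling cast", the three functions return, respectively, a sparse count vector over the vocabulary, one dense vector with the same dimension as the embeddings, and one count per cluster. The last two keep the feature size fixed regardless of vocabulary size, which is one reason such representations are compared against BoW.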