Publication: Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts
dc.affiliation.dpto | UC3M. Departamento de Informática | es |
dc.affiliation.grupoinv | UC3M. Grupo de Investigación: Human Language and Accessibility Technologies (HULAT) | es |
dc.contributor.author | Segura Bedmar, Isabel | |
dc.contributor.author | Camino Perdones, David | |
dc.contributor.author | Guerrero Aspizua, Sara | |
dc.contributor.funder | Comunidad de Madrid | es |
dc.contributor.funder | Ministerio de Ciencia e Innovación (España) | es |
dc.date.accessioned | 2023-12-20T15:50:11Z | |
dc.date.available | 2023-12-20T15:50:11Z | |
dc.date.issued | 2022-07-06 | |
dc.description.abstract | Background and objective: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition, rare diseases usually show a wide variety of manifestations, which can make diagnosis even more difficult. A delayed diagnosis can negatively affect the patient's life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment. Methods: The paper explores several deep learning techniques, such as Bidirectional Long Short-Term Memory (BiLSTM) networks and deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT), to recognize rare diseases and their clinical manifestations (signs and symptoms). Results: BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results, with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve the use of overlapped, nested and discontinuous entities, the model obtains lower results for signs, with an F1 of 57.2%. Conclusions: While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms). | en |
dc.description.sponsorship | Funding: This work is part of the R&D&i ACCESS2MEET project (PID2020-116527RB-I00), financed by MCIN AEI/10.13039/501100011033/. This work was also supported by the Community of Madrid under the Interdisciplinary Projects Program for Young Researchers (NLP4Rare-CM-UC3M project) and the line of Excellence of University Professors (EPUC3M17). | en |
dc.format.extent | 23 | es |
dc.identifier.bibliographicCitation | Segura-Bedmar, I., Camino-Perdones, D., & Guerrero-Aspizúa, S. (2022). Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts. BMC Bioinformatics, 23(1). | en |
dc.identifier.doi | https://doi.org/10.1186/s12859-022-04810-y | |
dc.identifier.issn | 1471-2105 | |
dc.identifier.publicationfirstpage | 1 | es |
dc.identifier.publicationissue | 263 | es |
dc.identifier.publicationlastpage | 23 | es |
dc.identifier.publicationtitle | BMC BIOINFORMATICS | en |
dc.identifier.publicationvolume | 23 | es |
dc.identifier.uri | https://hdl.handle.net/10016/39125 | |
dc.identifier.uxxi | AR/0000032111 | |
dc.language.iso | eng | es |
dc.publisher | Springer Nature | en |
dc.relation.dataset | https://doi.org/10.21950/S2IRKE | |
dc.relation.dataset | https://doi.org/10.21950/DEURZF | |
dc.relation.projectID | Gobierno de España. PID2020-116527RB-I00 | es |
dc.relation.projectID | Comunidad de Madrid. NLP4Rare-CM-UC3M | es |
dc.rights | © The Author(s) 2022 | en |
dc.rights | Attribution 3.0 Spain | * |
dc.rights.accessRights | open access | en |
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/es/ | * |
dc.subject.eciencia | Biology and Biomedicine | es |
dc.subject.other | Rare diseases | en |
dc.subject.other | Named entity recognition | en |
dc.subject.other | Deep learning | en |
dc.title | Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts | en |
dc.type | research article | en |
dspace.entity.type | Publication |