Publication:
Medical data wrangling with sequential variational autoencoders

dc.affiliation.dpto | UC3M. Departamento de Teoría de la Señal y Comunicaciones | es
dc.affiliation.grupoinv | UC3M. Grupo de Investigación: Tratamiento de la Señal y Aprendizaje (GTSA) | es
dc.contributor.author | Barrejón Moreno, Daniel
dc.contributor.author | Martínez Olmos, Pablo
dc.contributor.author | Artés Rodríguez, Antonio
dc.contributor.funder | Comunidad de Madrid | es
dc.contributor.funder | European Commission | en
dc.contributor.funder | Ministerio de Ciencia e Innovación (España) | es
dc.date.accessioned | 2022-06-06T14:30:45Z
dc.date.available | 2022-06-06T14:30:45Z
dc.date.issued | 2022-06
dc.description.abstract | Medical data sets are usually corrupted by noise and missing data. These missing patterns are commonly assumed to be completely random, but in medical scenarios they occur in bursts, caused by sensors that are off for some time or by data collected in a misaligned, uneven fashion, among other causes. This paper proposes to model medical data records with heterogeneous data types and bursty missing data using sequential variational autoencoders (VAEs). In particular, we propose a new methodology, the Shi-VAE, which extends the capabilities of VAEs to sequential streams of data with missing observations. We compare our model against state-of-the-art solutions on an intensive care unit (ICU) database and a dataset of passive human monitoring. Furthermore, we find that standard error metrics such as RMSE are not conclusive enough to assess temporal models, and we include in our analysis the cross-correlation between the ground truth and the imputed signal. We show that Shi-VAE achieves the best performance on both metrics, with lower computational complexity than the GP-VAE model, which is the state-of-the-art method for medical records. | en
dc.description.sponsorship | This work was supported in part by Spanish Government MCI under Grants TEC2017-92552-EXP and RTI2018-099655-B-I00, in part by Comunidad de Madrid under Grants IND2017/TIC-7618, IND2018/TIC-9649, IND2020/TIC-17372, and Y2018/TCS-4705, in part by BBVA Foundation under the Deep-DARWiN Project, and in part by the European Union (FEDER) and the European Research Council (ERC) through the European Union's Horizon 2020 research and innovation program under Grant 714161. | en
dc.format.extent | 9 | es
dc.identifier.bibliographicCitation | IEEE Journal of Biomedical and Health Informatics, 26(6), Jun. 2022, pp. 2737-2745 | en
dc.identifier.doi | https://doi.org/10.1109/JBHI.2021.3123839
dc.identifier.issn | 2168-2194
dc.identifier.issn | 2168-2208 (online)
dc.identifier.publicationfirstpage | 2737 | es
dc.identifier.publicationissue | 6 | es
dc.identifier.publicationlastpage | 2745 | es
dc.identifier.publicationtitle | IEEE Journal of Biomedical and Health Informatics | en
dc.identifier.publicationvolume | 26 | es
dc.identifier.uri | https://hdl.handle.net/10016/35008
dc.identifier.uxxi | AR/0000028561
dc.language.iso | eng | en
dc.publisher | IEEE | en
dc.relation.projectID | info:eu-repo/grantAgreement/H2020/714161/LOLITA | es
dc.relation.projectID | Gobierno de España. TEC2017-92552-EXP | es
dc.relation.projectID | Gobierno de España. RTI2018-099655-B-I00 | es
dc.relation.projectID | Comunidad de Madrid. IND2017/TIC-7618 | es
dc.relation.projectID | Comunidad de Madrid. IND2018/TIC-9649 | es
dc.relation.projectID | Comunidad de Madrid. IND2020/TIC-17372 | es
dc.relation.projectID | Comunidad de Madrid. Y2018/TCS-4705 | es
dc.rights | © 2021 IEEE. | en
dc.rights.accessRights | open access | en
dc.subject.eciencia | Biología y Biomedicina | es
dc.subject.eciencia | Telecomunicaciones | es
dc.subject.other | Deep learning | en
dc.subject.other | Variational autoencoders | en
dc.subject.other | Missing data | en
dc.subject.other | Heterogeneous | en
dc.subject.other | Sequential data | en
dc.title | Medical data wrangling with sequential variational autoencoders | en
dc.type | research article | *
dc.type.hasVersion | AM | *
dspace.entity.type | Publication
Files
Original bundle
Name:
medical_JBHI_2022_ps.pdf
Size:
750.49 KB
Format:
Adobe Portable Document Format