Publication:
Handling incomplete heterogeneous data using VAEs

Publication date
2020-11
Publisher
Elsevier
Abstract
Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate for capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with data missing at random), both of which are common in real-world applications. In this paper, we propose a general framework to design VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows accurate estimation (and potentially imputation) of missing data. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
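To make the idea of per-type likelihoods with missing entries concrete, the following is a minimal sketch of a reconstruction term that sums log-likelihoods over observed attributes only. It is not the authors' implementation. The specific distributions (Gaussian, log-normal, Poisson, softmax categorical), the function names, and the parameter layout are all illustrative assumptions, since the abstract does not specify them; interval and ordinal types are omitted for brevity.

```python
import math

def log_lik_real(x, mean, var):
    """Gaussian log-density (assumed choice for real-valued attributes)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def log_lik_positive(x, mean, var):
    """Log-normal log-density (assumed choice for positive real-valued attributes)."""
    return log_lik_real(math.log(x), mean, var) - math.log(x)

def log_lik_count(x, rate):
    """Poisson log-pmf (assumed choice for count attributes)."""
    return x * math.log(rate) - rate - math.lgamma(x + 1)

def log_lik_categorical(x, logits):
    """Log-probability of category index x under a softmax over logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[x] - log_norm

# Hypothetical registry mapping an attribute type to its likelihood function.
LIKELIHOODS = {
    "real": log_lik_real,
    "positive": log_lik_positive,
    "count": log_lik_count,
    "categorical": log_lik_categorical,
}

def masked_log_likelihood(values, observed, kinds, params):
    """Sum per-attribute log-likelihoods, skipping attributes marked as missing."""
    total = 0.0
    for x, is_obs, kind, p in zip(values, observed, kinds, params):
        if is_obs:  # missing entries contribute nothing to the objective
            total += LIKELIHOODS[kind](x, **p)
    return total

# One record with four attributes; the second (positive-valued) one is missing.
values = [1.0, None, 3, 2]
observed = [True, False, True, True]
kinds = ["real", "positive", "count", "categorical"]
params = [
    {"mean": 0.0, "var": 1.0},
    {"mean": 0.0, "var": 1.0},
    {"rate": 2.0},
    {"logits": [0.0, 0.0, 0.0]},
]
score = masked_log_likelihood(values, observed, kinds, params)
```

In a full model, the entries of `params` would be produced by a decoder network from the latent code rather than fixed by hand, and the same per-type distributions could be sampled to impute the missing attribute.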
Keywords
Generative models, Variational autoencoders, Incomplete heterogeneous data
Bibliographic citation
Nazábal, A., Olmos, P. M., Ghahramani, Z. & Valera, I. (2020). Handling incomplete heterogeneous data using VAEs. Pattern Recognition, vol. 107, 107501.