Article overview
Learning multi-modal generative models with permutation-invariant encoders and tighter variational bounds
Authors: Marcel Hirt; Domenico Campolo; Victoria Leong; Juan-Pablo Ortega
Date: 1 Sep 2023
Abstract: Devising deep latent variable models for multi-modal data has been a
long-standing theme in machine learning research. Multi-modal Variational
Autoencoders (VAEs) have been a popular generative model class that learns
latent representations which jointly explain multiple modalities. Various
objective functions for such models have been suggested, often motivated as
lower bounds on the multi-modal data log-likelihood or from
information-theoretic considerations. In order to encode latent variables from
different modality subsets, Product-of-Experts (PoE) or Mixture-of-Experts
(MoE) aggregation schemes have been routinely used and shown to yield different
trade-offs, for instance, regarding their generative quality or consistency
across multiple modalities. In this work, we consider a variational bound that
can tightly lower bound the data log-likelihood. We develop more flexible
aggregation schemes that generalise PoE or MoE approaches by combining encoded
features from different modalities based on permutation-invariant neural
networks. Our numerical experiments illustrate trade-offs for multi-modal
variational bounds and various aggregation schemes. We show that tighter
variational bounds and more flexible aggregation models can become beneficial
when one wants to approximate the true joint distribution over observed
modalities and latent variables in identifiable models.
Source: arXiv, 2309.00380
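
To make the aggregation schemes named in the abstract concrete, here is a minimal sketch under simplifying assumptions. It is not the authors' implementation. It assumes one-dimensional Gaussian experts, one per modality, and a standard normal prior. The function names (`poe_gaussian`, `moe_gaussian_moments`, `sum_pool_aggregate`) are hypothetical and chosen only for illustration. The last function shows one simple permutation-invariant form, a transform applied to a sum of per-modality features; the architectures in the paper may differ.

```python
import math


def poe_gaussian(means, variances, prior_mean=0.0, prior_var=1.0):
    """Product of 1-D Gaussian experts, including a Gaussian prior expert.

    A product of Gaussian densities is (up to normalisation) Gaussian:
    its precision is the sum of the expert precisions, and its mean is
    the precision-weighted average of the expert means.
    """
    precisions = [1.0 / prior_var] + [1.0 / v for v in variances]
    weighted = [prior_mean / prior_var] + [m / v for m, v in zip(means, variances)]
    var = 1.0 / sum(precisions)
    mean = var * sum(weighted)
    return mean, var


def moe_gaussian_moments(means, variances):
    """Mean and variance of a uniform mixture of 1-D Gaussian experts."""
    k = len(means)
    mean = sum(means) / k
    second_moment = sum(v + m * m for m, v in zip(means, variances)) / k
    return mean, second_moment - mean * mean


def sum_pool_aggregate(features, rho=math.tanh):
    """Toy permutation-invariant aggregation: rho applied to the feature sum."""
    return rho(sum(features))


# Two modalities encode the same latent with different means.
poe_mean, poe_var = poe_gaussian([1.0, 3.0], [1.0, 1.0])
moe_mean, moe_var = moe_gaussian_moments([1.0, 3.0], [1.0, 1.0])
print(poe_mean, poe_var)  # PoE sharpens: variance shrinks below each expert's
print(moe_mean, moe_var)  # MoE spreads: variance grows when experts disagree
```

In this toy setting the PoE posterior is narrower than any single expert, while the MoE posterior widens when the experts disagree, which is one way to see why the two schemes trade off differently. All three functions return the same value when the modalities are reordered, which is the property a permutation-invariant encoder keeps while allowing a more flexible combination rule.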