What if the observation is extracted features instead of images and has much smaller dimension than latent?
Hi!
I'm not sure you still do Q&A support here 😊, but I'm obsessed with a certain problem that is beyond my math skills. I hope you can help me.
The question is about the loss function of your RSSM, which uses a variational approach. The reconstruction loss of the VAE is p(o_t|s_t), since the decoder maps from the latent back to the image. In that case, an observation (= image) has a much higher dimension than the latent. But what about the case in which o_t has a much smaller dimension than the latent: for example, the 4 values of CartPole in OpenAI Gym's classic_control, versus a latent of, say, 32~64? I suspect p(o_t|s_t) could not learn any meaningful distribution there. Since s_t is sampled from the variational posterior q(s_t|a_1:t, o_1:t), which has already seen the current observation o_t, s_t could simply learn to copy the full o_t into itself, because the dimension of s_t is much bigger.
In this situation (non-image observations with a small dimension), can we still use this VAE-like approach?
Or is there some other technique that would be more reasonable in this case?
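For concreteness, the bound I have in mind is roughly the one-step variational bound from the PlaNet paper (written from memory, so the conditioning may not match your code exactly):

```latex
% Variational bound on the sequence likelihood (reconstruction + KL),
% reproduced from memory of the PlaNet paper, not copied from the repo:
\ln p(o_{1:T} \mid a_{1:T}) \;\ge\; \sum_{t=1}^{T} \Big(
    \mathbb{E}_{q(s_t \mid o_{\le t},\, a_{<t})}\!\big[\ln p(o_t \mid s_t)\big]
    \;-\; \mathbb{E}\big[\mathrm{KL}\big(q(s_t \mid o_{\le t},\, a_{<t})
    \,\big\|\, p(s_t \mid s_{t-1}, a_{t-1})\big)\big] \Big)
```

My worry is about the first (reconstruction) term when dim(o_t) << dim(s_t).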
I hope this worry makes sense to you. 😕
Since the autoencoder is used for dimensionality reduction (in the default configs, from 64x64x3 = 12288 dimensions down to around 500), I would not apply it in the scenario you describe. If you have a low-dimensional input, you can skip the autoencoder, since it wouldn't give you any gain. I assume you can still learn the latent dynamics model and the reward model and then apply MPC, just like PlaNet would do if you scrapped the VAE.
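In code, skipping the autoencoder could look roughly like the sketch below: a small model that predicts the next observation and the reward directly from the low-dimensional input, planned over with CEM-style MPC. This is a minimal PyTorch illustration, not this repository's TensorFlow implementation; all names (`DynamicsModel`, `cem_plan`, etc.) are hypothetical.

```python
# Minimal sketch: model the low-dimensional observation directly (no VAE),
# learn a one-step dynamics + reward model, and plan with CEM-based MPC.
# Hypothetical names throughout; not the code from this repository.
import torch
import torch.nn as nn


class DynamicsModel(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        # With a 4-dimensional CartPole observation there is nothing to
        # compress, so the "state" is just the observation itself.
        self.next_obs = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ELU(),
            nn.Linear(hidden, obs_dim))
        self.reward = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ELU(),
            nn.Linear(hidden, 1))

    def forward(self, obs, act):
        x = torch.cat([obs, act], dim=-1)
        return self.next_obs(x), self.reward(x).squeeze(-1)


@torch.no_grad()
def cem_plan(model, obs, act_dim, horizon=12, iters=5, pop=500, elites=50):
    """CEM-style MPC: sample action sequences, evaluate their return under
    the learned model, and refit a Gaussian to the best sequences."""
    mean = torch.zeros(horizon, act_dim)
    std = torch.ones(horizon, act_dim)
    for _ in range(iters):
        actions = mean + std * torch.randn(pop, horizon, act_dim)
        o = obs.unsqueeze(0).expand(pop, -1)
        returns = torch.zeros(pop)
        for t in range(horizon):
            o, r = model(o, actions[:, t])
            returns += r
        elite = actions[returns.topk(elites).indices]
        mean, std = elite.mean(dim=0), elite.std(dim=0)
    return mean[0]  # execute the first action, then replan
```

The two heads would be trained with plain regression (e.g. MSE on observed transitions and rewards); the point is only that no encoder/decoder pair is needed when o_t is already low-dimensional.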