
T-SNE visualization of skill-space z-embeddings #17

Open
junh0cho opened this issue Nov 27, 2024 · 0 comments
Hi, I am trying to visualize the skill space as in your Figure 5.
QueST's skill-prior transformer predicts 8 tokens for every 8 steps of rollout (the action horizon).
I assume these 8 token indices index into the skill codebook (256-dimensional vectors), not the SkillGPT embedding (384-dimensional).

```
** rollouts 1 of KITCHEN_SCENE10_close_the_top_drawer_of_the_cabinet starting
 
** Step: 0 : tensor([[746, 639, 631, 614, 644, 459, 506, 466]], device='cuda:0')
** Step: 8 : tensor([[907, 639, 638, 812, 851, 659, 507, 467]], device='cuda:0')
** Step: 16 : tensor([[909, 835, 835, 827, 819, 706, 587, 588]], device='cuda:0')
** Step: 24 : tensor([[919, 833, 834, 826, 939, 980, 780, 388]], device='cuda:0')
** Step: 32 : tensor([[799, 824, 865, 948, 781, 781, 581, 388]], device='cuda:0')
** Step: 40 : tensor([[799, 906, 948, 781, 773, 773, 572, 396]], device='cuda:0')
** Step: 48 : tensor([[798, 906, 980, 774, 774, 573, 373, 148]], device='cuda:0')
** Step: 56 : tensor([[790, 779, 980, 774, 773, 181, 108,  68]], device='cuda:0')
** Step: 64 : tensor([[790, 979, 780, 773, 132, 109, 156, 108]], device='cuda:0')
```

Above is an example of the printed token indices.
Interestingly, successive steps seem to share some tokens at the same positions (e.g. 639, 388, 799, 980, 790, ...).

From my understanding, these 8 skill codes are not causal: the action decoder attends to them via cross-attention rather than consuming them as autoregressive inputs.
So for now I have average-pooled the 8 vectors and plotted one point per 8-step window, but I would like to ask you for details.
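For reference, this is roughly how I am doing the pooling and t-SNE. It is only a minimal sketch: the codebook here is a random NumPy stand-in (in practice I look up the real codebook weight matrix from the trained autoencoder), and the shapes (1024 entries of dimension 256, inferred from the indices above going up to 980) are my assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for the trained skill codebook: 1024 entries x 256 dims (assumed shapes).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(1024, 256))

# Token indices from a few rollout steps (one row per 8-step window), as printed above.
token_ids = np.array([
    [746, 639, 631, 614, 644, 459, 506, 466],
    [907, 639, 638, 812, 851, 659, 507, 467],
    [909, 835, 835, 827, 819, 706, 587, 588],
])

# Option A: one point per 8-step window -> average-pool the 8 code vectors.
pooled = codebook[token_ids].mean(axis=1)          # shape (n_windows, 256)

# Option B: one point per token -> flatten all codes.
per_token = codebook[token_ids].reshape(-1, 256)   # shape (n_windows * 8, 256)

# t-SNE requires perplexity < n_samples, so keep it small for few points.
emb = TSNE(n_components=2, perplexity=5, init="pca",
           random_state=0).fit_transform(per_token)
print(emb.shape)  # (24, 2)
```

Option A is what I currently plot; option B would instead give one dot per token, which is why I am unsure which one matches your figure.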

How should I visualize the eight 256-dimensional vectors for each 8-step window to reproduce a figure similar to yours?

  1. Does each dot in Figure 5 correspond to one 8-step window of a rollout episode?
  2. Which embedding is plotted with t-SNE: the skill-prior transformer's embedding for each token, or the skill codebook vectors from the autoencoder?

Thank you!
