Clarification on obtaining the embedding related to the <POSE> token #2

AndrejHafner · 2023-12-26T20:05:03Z

Hello! First of all, thank you for the great article. I have a question about how you obtain the embedding related to the <POSE> token, which is then projected and used for human pose reconstruction. If I understand correctly, when the model outputs a <POSE> token, you take the logits from the last layer of the LLM (to which softmax is applied, giving the distribution the token is sampled from) and use those as the embedding?

JJJYmmm · 2024-01-24T09:10:23Z

I think it's the last-layer embedding (hidden_states, before the logits) corresponding to the <POSE> token. You can refer to LISA for an example: https://github.com/dvlab-research/LISA.
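To make the distinction concrete, here is a minimal NumPy sketch of the idea, not the authors' actual code. The toy sizes, `POSE_TOKEN_ID`, the random `lm_head` and `proj_w` matrices, and the helper `pose_embeddings` are all hypothetical stand-ins. It shows only that the embedding is read from the last-layer hidden states (width `D_MODEL`) and not from the logits (width `VOCAB`). Whether a real implementation indexes the position where <POSE> is fed in or the position that predicted it depends on the implementation; the sketch simply assumes the token ids are aligned with the hidden-state positions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes and the <POSE> token id are hypothetical, for illustration only.
SEQ_LEN, D_MODEL, VOCAB, D_POSE = 6, 8, 12, 4
POSE_TOKEN_ID = 11

# Stand-ins for what a decoder-only LLM would produce.
last_hidden = rng.normal(size=(SEQ_LEN, D_MODEL))  # last-layer hidden states
lm_head = rng.normal(size=(VOCAB, D_MODEL))        # output (unembedding) matrix
logits = last_hidden @ lm_head.T                   # (SEQ_LEN, VOCAB), used only for sampling

# Token ids assumed to be aligned with hidden-state positions;
# here <POSE> sits at position 4.
token_ids = np.array([3, 7, 1, 5, POSE_TOKEN_ID, 2])


def pose_embeddings(hidden, ids, pose_id):
    """Return the last-layer hidden state at every position holding pose_id."""
    mask = ids == pose_id
    return hidden[mask]  # shape: (num_pose_tokens, D_MODEL)


pose_hidden = pose_embeddings(last_hidden, token_ids, POSE_TOKEN_ID)

# Hypothetical linear projection into the pose decoder's input space.
proj_w = rng.normal(size=(D_MODEL, D_POSE))
pose_feature = pose_hidden @ proj_w  # shape: (num_pose_tokens, D_POSE)
```

With Hugging Face style models, the analogous tensor is usually available by requesting `output_hidden_states=True` and taking the last entry of `hidden_states`, but check the repository for how it is actually done there.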
