Question on Feature & Unshifted token in Experiments #159

kimjoohyungsd · 2024-11-12T11:33:51Z

I am curious how you implemented these experiments. Taking Figure 6 in EAGLE-1 as an example, the feature & unshifted token can be concatenated for tokens generated by the large target model. However, for tokens generated by the draft model, how can their features be obtained without running the model in advance?

hongyanz · 2024-11-17T17:39:13Z

That is why it is the feature & shifted-token scheme. The feature predicted by the draft model goes through the LM head to get a distribution, and we sample the next token from this distribution. In the next round, we concatenate this feature with the sampled token for the next generation step. Figure 6 gives a clear description.
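For readers following along, the loop described above can be sketched roughly as follows. This is a toy illustration only, not the EAGLE code: the names `draft_step`, `lm_head_argmax`, and `draft_chain`, the random weights, and the tiny dimensions are all made up for the example, the draft model is reduced to a single linear layer with no attention over earlier positions, and greedy argmax stands in for sampling.

```python
import random

HIDDEN, VOCAB = 4, 6
rng = random.Random(0)

# Hypothetical toy weights; a real draft model is a trained network.
EMBED = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(VOCAB)]
LM_HEAD = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(VOCAB)]
W_DRAFT = [[rng.uniform(-1, 1) for _ in range(2 * HIDDEN)] for _ in range(HIDDEN)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lm_head_argmax(feature):
    """Map a feature to logits and pick a token (greedy, in place of sampling)."""
    logits = matvec(LM_HEAD, feature)
    return max(range(VOCAB), key=logits.__getitem__)


def draft_step(feature, shifted_token):
    """Concatenate the feature with the embedding of the token one step ahead
    and predict the next feature (a single linear layer, as an assumption)."""
    x = feature + EMBED[shifted_token]  # list concatenation, length 2 * HIDDEN
    return matvec(W_DRAFT, x)


def draft_chain(target_feature, target_token, steps):
    """Draft `steps` tokens starting from the target model's last feature
    and the token the target model sampled from that feature."""
    feature, token = target_feature, target_token
    drafted = []
    for _ in range(steps):
        feature = draft_step(feature, token)  # predicted next feature
        token = lm_head_argmax(feature)       # token drawn from that feature
        drafted.append(token)                 # reused as input in the next round
    return drafted


drafted = draft_chain([0.1, 0.2, 0.3, 0.4], target_token=2, steps=3)
print(drafted)
```

The point of the sketch is that, after the first step, both the feature and the token fed to the draft model come from the draft model itself, so the target model does not need to run in advance for drafted positions.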

haiduo · 2024-12-24T07:06:25Z

Hi @hongyanz @yanjunplay ,
Then may I ask, how is the feature & unshifted-token scheme implemented? From Figure 8 in Eagle1, it seems that feature & shifted-token achieves significant improvements compared to feature & unshifted-token. Additionally, are the differences between feature & shifted-token and feature & unshifted-token reflected during the inference phase or the training phase?
