I was recently trying to make a custom sampler (documented on reddit here) and I ran into what seems like a seriously sharp edge in the high-level API that should at least be documented, if not changed. Long story short: using `llm.eval_logits` to get the next-token logits instead of `llm.scores` leads to roughly a 15x slowdown (17 seconds for a completion versus over 300 seconds), because `llm.eval_logits` calls `to_list` on the logits (code here). This was very surprising to me.

Is there a reason `eval_logits` is implemented in such an inefficient way, when it essentially duplicates `scores`? This took far more debugging time than I care to admit!
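For anyone hitting the same wall, here is a minimal sketch of the difference. The `FakeLlama` class below is a stand-in I wrote for illustration, not the real `llama_cpp.Llama`; it only mimics the two attributes discussed here (`scores` as a `(n_ctx, n_vocab)` numpy buffer and `eval_logits` as a list-converting view), so verify the attribute names against your installed version:

```python
import numpy as np

class FakeLlama:
    """Hypothetical stand-in for llama_cpp.Llama, just enough to
    show the two access patterns; not the library's actual code."""
    def __init__(self, n_ctx=8, n_vocab=1000):
        # `scores` is a (n_ctx, n_vocab) float32 buffer of raw logits.
        self.scores = np.random.rand(n_ctx, n_vocab).astype(np.float32)
        self.n_tokens = 3  # pretend three tokens have been evaluated

    @property
    def eval_logits(self):
        # Mirrors the slow path: converts every evaluated row into a
        # plain Python list before you can index it.
        return [row.tolist() for row in self.scores[: self.n_tokens]]

def next_token_logits_fast(llm):
    # Fast path: a zero-copy numpy view of the last evaluated row.
    return llm.scores[llm.n_tokens - 1, :]

def next_token_logits_slow(llm):
    # Slow path: materializes Python lists for *all* rows, then indexes.
    return np.array(llm.eval_logits[-1], dtype=np.float32)

llm = FakeLlama()
fast = next_token_logits_fast(llm)
slow = next_token_logits_slow(llm)
assert np.allclose(fast, slow)  # same logits, very different cost
```

Both paths return identical logits; the slow one just pays a per-element list conversion on every decode step, which is where the 15x shows up at real vocabulary sizes.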