
Seq2Seq: Special tokens are also added to targets for LL computation #149

Open · samsontmr opened this issue Mar 2, 2023 · 2 comments

@samsontmr (Member) commented Mar 2, 2023

Location: https://github.com/bigscience-workshop/lm-evaluation-harness/blob/master/lm_eval/models/huggingface.py#L460
@jon-tow I'm not sure if special tokens should be included as part of the target sequence when doing the LL computation.
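For illustration, here's a minimal sketch of the behaviour in question (the model name is arbitrary, not necessarily the one under test): with the default `add_special_tokens=True`, a seq2seq tokenizer such as T5's appends an EOS token to the encoded target, so that token's log-probability ends up in the LL sum.

```python
from transformers import AutoTokenizer

# Arbitrary seq2seq tokenizer, purely for illustration.
tok = AutoTokenizer.from_pretrained("t5-small")

target = "Paris"
print(tok(target)["input_ids"])                            # trailing </s> (eos) id included
print(tok(target, add_special_tokens=False)["input_ids"])  # raw target tokens only
```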

@jon-tow (Collaborator) commented Mar 2, 2023

Yeah, that's reasonable. What effect does skipping special (label) tokens have on the logits?

outputs = self._model_call(inputs=inputs_tokens, labels=targets_tokens)

Is there a way to identify special-token positions so that we can strip them after the model call and then only compare the logits of the non-special tokens here:
log_softmax = log_softmax[:length]
target_tokens = target_tokens[:length]
greedy_tokens = log_softmax.argmax(dim=-1)
max_equal = (greedy_tokens == target_tokens).all()
target_logits = torch.gather(
    log_softmax, 1, target_tokens.unsqueeze(-1)
).squeeze(-1)
answer = (float(target_logits.sum()), bool(max_equal))
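A rough sketch of one way to do that (untested against the harness; it assumes `tokenizer` is the HF tokenizer and that `log_softmax` / `target_tokens` are the same per-sequence tensors as in the snippet above): build a boolean mask from `tokenizer.all_special_ids` and drop those positions before the argmax/gather.

```python
import torch

# Keep only positions whose target token is not a special token.
special_ids = torch.tensor(tokenizer.all_special_ids, device=target_tokens.device)
keep = ~torch.isin(target_tokens, special_ids)

target_tokens = target_tokens[keep]
log_softmax = log_softmax[keep]

greedy_tokens = log_softmax.argmax(dim=-1)
max_equal = (greedy_tokens == target_tokens).all()
target_logits = torch.gather(
    log_softmax, 1, target_tokens.unsqueeze(-1)
).squeeze(-1)
answer = (float(target_logits.sum()), bool(max_equal))
```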

@samsontmr (Member, Author) commented

> What effect does skipping special (label) tokens have on the logits?

It keeps the special tokens' logits out of the computation, since those logits can take different values depending on which tokens they appear next to.

Seems like a better solution would be to have a separate encode function that never adds special tokens?
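Something like this, as a rough sketch (the helper name is made up for illustration, and `tokenizer` / `target_text` stand in for whatever the harness passes at that point; the real fix would live wherever the harness currently encodes targets):

```python
def tok_encode_raw(tokenizer, text):
    """Encode text without letting the tokenizer add BOS/EOS or other special tokens."""
    return tokenizer.encode(text, add_special_tokens=False)

# e.g. use this for targets so the LL sum only covers real continuation tokens
targets_tokens = tok_encode_raw(tokenizer, target_text)
```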
