
_forward_step_fn does not always return two values so eval.py breaks if is_pipe_parallel is false #1320

Open
markNZed opened this issue Nov 12, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@markNZed
Contributor

This call to _forward_step_fn expects two return values:

_, logits = self._forward_step_fn(model=self.model, data_iterator=inps)

However, forward_step can return three values:

return loss, outputs, metrics

Unpacking a three-element tuple into two variables raises a ValueError. I suspect I am hitting this because I have is_pipe_parallel set to false, which is an uncommon configuration. Perhaps there should be an option not to return metrics.
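As an aside, one way to sidestep the mismatch on the caller side would be a small defensive unpacking helper. This is only a sketch of the idea, not code from the repository; the helper name and the assumption that the result is always a tuple of length 2 or 3 are hypothetical:

```python
# Hypothetical helper: unpack a forward-step result that may contain
# either (loss, logits) or (loss, logits, metrics), discarding metrics.
def unpack_forward_step(result):
    """Return (loss, logits) whether or not metrics are included."""
    if len(result) == 3:
        loss, logits, _metrics = result  # drop metrics during eval
    else:
        loss, logits = result
    return loss, logits

# The eval.py call site could then be written as:
#   _, logits = unpack_forward_step(
#       self._forward_step_fn(model=self.model, data_iterator=inps)
#   )
```

This keeps eval.py working for both return shapes, though changing forward_step itself (e.g. gating the metrics return behind a flag) may be the cleaner fix.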

@markNZed markNZed added the bug Something isn't working label Nov 12, 2024
@markNZed
Contributor Author

There are several "fixes" in https://github.com/markNZed/gpt-neox/tree/pipe_parallel_size_1 that might be related to this. I have not had time to prepare a PR, but if someone who knows the code base looks at the changes there, I expect they will quickly spot many easy-to-fix issues.

@iPRET

iPRET commented Nov 14, 2024

Can confirm I've run into this issue multiple times as well, even with pipe parallel size > 1.
