Training errors due to batching of oracle_actions #1

mdelhoneux · 2020-05-25T13:56:00Z

Training lt_alksnis and fr_sequoia with a batch size of 8 produces errors. I have not fully diagnosed the issue, but I am fairly confident the problem comes from reading the oracle actions from the metadata 'gold_actions' list instead of from a tensor, as the old code did; see

koepsala-parser/modules/transition_parser_eud.py

Line 492 in 56ac985

oracle_actions = deepcopy([d['gold_actions'] for d in metadata])

The problem arises because we pop from this list after each action is taken. In the majority of cases this works fine, since we should only see a sentence once per training iteration. However, AllenNLP seems to sometimes repeat a sentence within a batch (presumably to make sure the batch is full). When that happens, the second time we see the sentence, the correct action to take has already been popped from the list.
The easiest fix is probably to revert to using tensors as gold actions.
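One possible mechanism, not confirmed here, is that the repeated sentence shares a single 'gold_actions' list object across two batch positions. Python's `deepcopy` preserves that kind of aliasing inside the structure it copies, so both positions would still pop from the same list. The sketch below reproduces this with made-up data. The helper names `copy_shared` and `copy_independent` and the action labels are hypothetical, and copying each list separately is an alternative workaround to the tensor-based fix suggested above, not the approach from the old code.

```python
from copy import deepcopy


def copy_shared(metadata):
    # Same pattern as the line quoted above: one deepcopy over the outer list.
    # deepcopy memoizes objects it has already copied, so two entries that
    # point at the same inner list still point at one (new) list afterwards.
    return deepcopy([d["gold_actions"] for d in metadata])


def copy_independent(metadata):
    # Hypothetical workaround: build a fresh list per batch position.
    return [list(d["gold_actions"]) for d in metadata]


# Hypothetical batch in which the same sentence (same dict) appears twice.
sentence = {"gold_actions": ["SHIFT", "SHIFT", "LEFT-ARC", "RIGHT-ARC"]}
metadata = [sentence, sentence]

shared = copy_shared(metadata)
shared[0].pop(0)  # take one action for the first copy of the sentence
# The second copy has lost the same action, because both entries are one list.
print(len(shared[0]), len(shared[1]))

independent = copy_independent(metadata)
independent[0].pop(0)
# Here the second copy keeps its full action sequence.
print(len(independent[0]), len(independent[1]))
```

If the duplicated sentence instead arrives as two separate objects, this aliasing would not occur and the cause lies elsewhere, so checking `shared[0] is shared[1]` on a failing batch would be a quick way to test the hypothesis.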
