eval can be by iteration number, not epoch #1

christ1ne · 2018-10-19T17:27:10Z

SSD evaluates on iterations, instead of epochs.

$ python mlp_compliance.py run.log.7
FAIL: Check has_eval_epoch failed on
:::MLPv0.5.0 ssd 1539850795.942373991 (train.py:134) eval_accuracy: {"iteration": 120000, "value": 0.1751619840501347}
FAILED: compliance errors.
$ grep -r has_eval_epoch .
./configs/v0.5.0_common.yaml: NAME: has_eval_epoch
./configs/v0.5.0_level1.yaml: NAME: has_eval_epoch
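
For anyone reproducing this, here is a minimal sketch of why the check trips on that line: the logged value carries an "iteration" key and no "epoch" key. The regular expression and the `parse_line` helper are assumptions inferred from the single log line quoted above, not the parser that mlp_compliance.py actually uses.

```python
import json
import re

# Pattern inferred from the one log line quoted in this issue (an assumption,
# not the checker's own regex).
LINE_RE = re.compile(
    r"^:::MLPv(?P<version>\S+) (?P<benchmark>\S+) (?P<timestamp>[\d.]+) "
    r"\((?P<location>[^)]*)\) (?P<tag>\w+)(?:: (?P<value>.*))?$"
)


def parse_line(line):
    """Hypothetical helper: split a log line into benchmark, tag and JSON value."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    raw = m.group("value")
    return {
        "benchmark": m.group("benchmark"),
        "tag": m.group("tag"),
        "value": json.loads(raw) if raw else None,
    }


line = (
    ":::MLPv0.5.0 ssd 1539850795.942373991 (train.py:134) "
    'eval_accuracy: {"iteration": 120000, "value": 0.1751619840501347}'
)
entry = parse_line(line)

# The value is keyed by "iteration", so a check that expects "epoch" cannot pass.
has_epoch = "epoch" in entry["value"]
```
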

bitfort · 2018-10-19T20:22:16Z

I'll take a look at this and what it should print.

christ1ne · 2018-10-19T20:35:08Z

Please check https://github.com/mlperf/training/blob/ssd_logging_v2/single_stage_detector/ssd/train.py

The lr is adjusted at specific iteration numbers, so there won’t be the same number of lr prints as epoch prints.

bitfort · 2018-10-19T22:00:53Z

To clarify, is "iteration" counting batches or examples?

bitfort · 2018-10-19T22:09:25Z

I've been talking with some engineers. We're wondering whether we could round to the epoch number for this print, and then print the exact iteration # in a separate tag. Do you think this could work?
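
A rough sketch of that rounding idea, under stated assumptions: `iteration_to_epoch` is a hypothetical helper, "round" is taken to mean rounding down, and the batch size and dataset size below are made-up illustrative numbers, not the SSD settings. Whether "iteration" counts batches or examples is still open in this thread, so the helper takes the number of examples per iteration as a parameter (the batch size if iterations are batches, 1 if they are examples).

```python
def iteration_to_epoch(iteration, examples_per_iteration, examples_per_epoch):
    """Hypothetical helper: completed epochs after `iteration` iterations.

    Rounds down, which is one possible reading of "round to the epoch number".
    """
    examples_seen = iteration * examples_per_iteration
    return examples_seen // examples_per_epoch


# Illustrative values only (assumptions, not taken from the SSD reference).
BATCH_SIZE = 32
EXAMPLES_PER_EPOCH = 100_000

iteration = 120000
epoch = iteration_to_epoch(iteration, BATCH_SIZE, EXAMPLES_PER_EPOCH)

# The rounded epoch would go in the eval print; the exact iteration would go
# in a separate tag (the tag layout here is hypothetical).
eval_value = {"epoch": epoch, "value": 0.1751619840501347}
iteration_value = {"iteration": iteration}
```
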

christ1ne · 2018-10-22T17:25:12Z

@bitfort The following checks also failed, since the evaluation does not start at epoch 0:
FAIL: Check each_eval_accuracy_has_0th_epoch failed.
FAIL: Check each_eval_start_has_0th_epoch failed.
FAIL: Check each_eval_stop_has_0th_epoch failed.

What's your suggestion on this?

bitfort · 2018-10-22T18:33:59Z

I added an exception so that the check ignores SSD and ResNet:

mlp_compliance/configs/v0.5.0_level1.yaml, line 27 in 97105db:

CODE: "v['epoch'] == 0 and ll.benchmark not in ['resnet', 'ssd']"

Can you try again on the newest version of this repo?
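
To illustrate what that expression does, here is a small sketch that evaluates it as a plain Python expression. It assumes `v` is bound to the logged value and `ll` to an object with a `benchmark` attribute; that binding, the `check` helper, and the "gnmt" benchmark name used as a contrast case are assumptions for illustration, not taken from the checker's source.

```python
from types import SimpleNamespace

# The expression quoted from the config above.
CODE = "v['epoch'] == 0 and ll.benchmark not in ['resnet', 'ssd']"


def check(value, benchmark):
    """Hypothetical helper: evaluate CODE with `v` and `ll` bound as assumed."""
    ll = SimpleNamespace(benchmark=benchmark)
    return eval(CODE, {}, {"v": value, "ll": ll})


# An epoch-0 line from a benchmark outside the exclusion list satisfies it.
other_epoch0 = check({"epoch": 0}, "gnmt")

# The same line from ssd or resnet is excluded by the second clause.
ssd_epoch0 = check({"epoch": 0}, "ssd")
resnet_epoch0 = check({"epoch": 0}, "resnet")
```
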
