Added EMA of running weights to the current codebase #853

Open
wants to merge 3 commits into main

Conversation

@WhenWen commented Jan 9, 2025

The syntax works like this:
```
--trainer.use_ema True
--trainer.ema_beta 0.995
```

One remaining issue is that the new code isn't compatible with the previous version of the trainer state.
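
For context (not part of the original PR description): with these settings the trainer keeps a shadow copy of the weights and blends it toward the current weights each step. A minimal sketch of that update in JAX, using illustrative names rather than Levanter's actual trainer API:

```python
import jax

def ema_update(ema_params, params, ema_beta=0.995):
    # Blend the running EMA of the weights toward the current weights.
    # A beta close to 1.0 means the EMA changes slowly.
    return jax.tree_util.tree_map(
        lambda e, p: ema_beta * e + (1.0 - ema_beta) * p, ema_params, params
    )
```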

@WhenWen requested a review from dlwh on January 9, 2025 06:40
@dlwh (Member) commented Jan 9, 2025

imho the ema stuff should go in the optimizer state, wdyt?

@WhenWen (Author) commented Jan 9, 2025

Sounds reasonable to me. This should improve compatibility as well. I will try to implement this.

@WhenWen (Author) commented Jan 12, 2025

After trying it out, I found that implementing EMA inside the optimizer limits the generality of this feature: I have to define specific optimizers like AdamWithEMA even though the optimizer is intrinsically the same as Adam. Because of this, I feel it may be better to keep the EMA checkpoint in the trainer state.

@dlwh (Member) commented Jan 14, 2025

i see what you're saying, but like, Optax doesn't really have an "adam" right? It's a bunch of transformations and this is just one more? Let me see
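
For illustration, here is a rough sketch of one way to do what dlwh describes: a pass-through Optax `GradientTransformation` that keeps an EMA of the parameters in its state and can be chained with any base optimizer such as `optax.adamw`, so no dedicated AdamWithEMA is needed. The names `ParamEmaState` and `param_ema` are hypothetical, not part of Optax or Levanter.

```python
from typing import Any, NamedTuple

import jax
import optax


class ParamEmaState(NamedTuple):
    ema_params: Any  # pytree mirroring the model parameters


def param_ema(beta: float = 0.995) -> optax.GradientTransformation:
    def init_fn(params):
        # Start the EMA at the initial parameter values.
        return ParamEmaState(ema_params=jax.tree_util.tree_map(lambda p: p, params))

    def update_fn(updates, state, params=None):
        if params is None:
            raise ValueError("param_ema requires params to be passed to update()")
        new_ema = jax.tree_util.tree_map(
            lambda e, p: beta * e + (1.0 - beta) * p, state.ema_params, params
        )
        # Gradient updates pass through unchanged; only the EMA state changes.
        return updates, ParamEmaState(ema_params=new_ema)

    return optax.GradientTransformation(init_fn, update_fn)


# Usage: chain it with the base optimizer, so the EMA lives in the optimizer state.
optimizer = optax.chain(optax.adamw(learning_rate=1e-3), param_ema(beta=0.995))
```

Note that, placed last in the chain, the transform receives the pre-step parameters, so the EMA here tracks the weights as of the previous step; whether to EMA before or after applying the update is an implementation detail.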
