This is not straightforward. We dropped some features, which means not every checkpoint will be usable in Eole; for instance, we dropped copy attention.
Tensors have been renamed, so the script below may help, but it will not cover 100% of cases if we rename things again.
Configs are not handled in the same way. We released a script to help, but again some manual adjustments may be needed.
Having said that, bear in mind that:
OpenNMT-py checkpoints are .pt files carrying the weights, vocab, optimizer state and config together.
In Eole we use safetensors, which means the weights sit in a .safetensors file, while config.json and vocab.json hold the config and the vocabularies.
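Before converting, it can help to peek inside the .pt checkpoint to see what it actually carries. A minimal sketch, assuming a standard OpenNMT-py checkpoint (the exact set of top-level keys depends on the OpenNMT-py version; only 'model' and 'generator' are used by the conversion below):

```python
import torch

# Load the OpenNMT-py checkpoint; it is a plain Python dict
ckpt = torch.load("myonmtcheckpoint.pt", map_location="cpu")

# Top-level entries: 'model' and 'generator' are used by the conversion script,
# the vocab/optimizer/config entries vary with the OpenNMT-py version
print(list(ckpt.keys()))

# Tensor names that will need to be remapped to Eole names
for name, tensor in ckpt["model"].items():
    print(name, tuple(tensor.shape))
```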
If you know what you are doing, you can use the following to convert the weights into the safetensors file:
```python
import re

import torch
from safetensors.torch import save_file

# Load the OpenNMT-py checkpoint (weights, vocab, optimizer state and config in one .pt file)
onmtckpt = torch.load("myonmtcheckpoint.pt", map_location="cpu")
newmodel = onmtckpt['model']

# Rename the tensors from OpenNMT-py names to Eole names
orig_keys = list(newmodel.keys())
for key in orig_keys:
    newkey = key.replace('feed_forward.layer_norm', 'post_attention_layernorm')
    newkey = newkey.replace('feed_forward.w_1', 'mlp.gate_up_proj')
    newkey = newkey.replace('feed_forward.w_2', 'mlp.down_proj')
    newkey = newkey.replace('feed_forward.w_3', 'mlp.up_proj')
    newkey = newkey.replace('layer_norm_1', 'input_layernorm')
    newkey = newkey.replace('layer_norm_2', 'precontext_layernorm')
    newkey = newkey.replace('encoder.transformer', 'encoder.transformer_layers')
    newkey = re.sub(r'(encoder\.transformer_layers\.\d+)\.layer_norm', r'\1.input_layernorm', newkey)
    newmodel[newkey] = newmodel.pop(key)

# Source embeddings get their Eole name; the decoder embeddings are popped and dropped here
newmodel['src_emb.embeddings.weight'] = newmodel.pop('encoder.embeddings.make_embedding.emb_luts.0.weight')
newmodel.pop('decoder.embeddings.make_embedding.emb_luts.0.weight')

# The generator bias is stored in a separate entry of the OpenNMT-py checkpoint
newmodel['generator.bias'] = onmtckpt['generator']['bias']
# if vocabs are not shared then you need to do the same for the tgt vocab

save_file(newmodel, "model.00.safetensors")
```
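As a quick sanity check (not part of the recipe above), you can load the converted file back with safetensors and confirm that the renamed tensors were all written:

```python
from safetensors.torch import load_file

# Reload the converted weights and list what was actually saved
converted = load_file("model.00.safetensors")
print(len(converted), "tensors saved")
for name in sorted(converted):
    print(name, tuple(converted[name].shape))
```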
For the config file, the easiest approach is to generate one with a fake (dummy) training run and then adjust the JSON keys/values.
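For example, once the dummy training run has produced a config.json, you can inspect and tweak it programmatically. The key names in the sketch below are placeholders, not guaranteed Eole config fields; use whatever keys your generated file actually contains:

```python
import json

# Load the config.json produced by the dummy training run
with open("config.json") as f:
    config = json.load(f)

# Inspect the generated keys before touching anything
print(json.dumps(config, indent=2))

# Adjust values to match the converted checkpoint; the key below is a
# hypothetical example, check your own generated file for the real ones
# config["some_section"]["some_key"] = "new_value"

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```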
Enjoy.