This is not straightforward. We dropped some features, which means not every checkpoint will be usable in Eole; for instance, we dropped copy attention.
Tensors have been renamed, so the script below may help, but it will not cover 100% of cases if we rename things again.
Configs are not handled in the same way. We released a script to help, but again some manual adjustments may be needed.
Having said that, bear in mind that:
OpenNMT-py checkpoints are .pt files carrying the weights, vocab, optimizer state and config together.
In Eole we use safetensors, which means the weights sit in a .safetensors file, while config.json and vocab.json hold the config and the vocabularies.
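Before converting, it can help to peek inside the .pt checkpoint to see what it actually carries. A minimal sketch, assuming a standard OpenNMT-py checkpoint (the exact set of top-level keys depends on the OpenNMT-py version; only 'model' and 'generator' are used by the conversion below):

```python
import torch

# Load the OpenNMT-py checkpoint; it is a plain Python dict
ckpt = torch.load("myonmtcheckpoint.pt", map_location="cpu")

# Top-level entries: 'model' and 'generator' are used by the conversion script,
# the vocab/optimizer/config entries vary with the OpenNMT-py version
print(list(ckpt.keys()))

# Tensor names that will need to be remapped to Eole names
for name, tensor in ckpt["model"].items():
    print(name, tuple(tensor.shape))
```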
If you know what you are doing, you can use the following to convert the weights into the safetensors file:
```python
import re

import torch
from safetensors.torch import save_file

# Load the OpenNMT-py checkpoint (weights, vocab, optimizer state and config in one .pt file)
onmtckpt = torch.load("myonmtcheckpoint.pt", map_location="cpu")
newmodel = onmtckpt['model']

# Rename the tensors from OpenNMT-py names to Eole names
orig_keys = list(newmodel.keys())
for key in orig_keys:
    newkey = key.replace('feed_forward.layer_norm', 'post_attention_layernorm')
    newkey = newkey.replace('feed_forward.w_1', 'mlp.gate_up_proj')
    newkey = newkey.replace('feed_forward.w_2', 'mlp.down_proj')
    newkey = newkey.replace('feed_forward.w_3', 'mlp.up_proj')
    newkey = newkey.replace('layer_norm_1', 'input_layernorm')
    newkey = newkey.replace('layer_norm_2', 'precontext_layernorm')
    newkey = newkey.replace('encoder.transformer', 'encoder.transformer_layers')
    newkey = re.sub(r'(encoder\.transformer_layers\.\d+)\.layer_norm', r'\1.input_layernorm', newkey)
    newmodel[newkey] = newmodel.pop(key)

# Source embeddings get their Eole name; the decoder embeddings are popped and dropped here
newmodel['src_emb.embeddings.weight'] = newmodel.pop('encoder.embeddings.make_embedding.emb_luts.0.weight')
newmodel.pop('decoder.embeddings.make_embedding.emb_luts.0.weight')

# The generator bias is stored in a separate entry of the OpenNMT-py checkpoint
newmodel['generator.bias'] = onmtckpt['generator']['bias']
# if vocabs are not shared then you need to do the same for the tgt vocab

save_file(newmodel, "model.00.safetensors")
```
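As a quick sanity check (not part of the recipe above), you can load the converted file back with safetensors and confirm that the renamed tensors were all written:

```python
from safetensors.torch import load_file

# Reload the converted weights and list what was actually saved
converted = load_file("model.00.safetensors")
print(len(converted), "tensors saved")
for name in sorted(converted):
    print(name, tuple(converted[name].shape))
```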
For the config file, the easiest approach is to generate one with a fake (dummy) training run and then adjust the JSON keys/values.
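For example, once the dummy training run has produced a config.json, you can inspect and tweak it programmatically. The key names in the sketch below are placeholders, not guaranteed Eole config fields; use whatever keys your generated file actually contains:

```python
import json

# Load the config.json produced by the dummy training run
with open("config.json") as f:
    config = json.load(f)

# Inspect the generated keys before touching anything
print(json.dumps(config, indent=2))

# Adjust values to match the converted checkpoint; the key below is a
# hypothetical example, check your own generated file for the real ones
# config["some_section"]["some_key"] = "new_value"

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```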
Enjoy.