This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

Load ColossalAI GPT model as HuggingFace/Transformers Model #199

Open
Red-Giuliano opened this issue Nov 28, 2022 · 2 comments

Comments

@Red-Giuliano

Red-Giuliano commented Nov 28, 2022

Describe the feature

Hi all,

I'm trying to use a GPT model that I trained with ColossalAI for inference with huggingface/transformers, but I can't load it as a Hugging Face model because it is implemented in plain PyTorch. How can I load the model I trained using the huggingface/transformers library?

Thanks so much for your help.

Best,
Red

@feifeibear
Contributor

Hi, can you tell me how you are trying to use the GPT model you trained with ColossalAI in huggingface/transformers? Pointing out which example you used as your implementation reference would be helpful.

@Red-Giuliano
Author

Hi feifeibear,

Thanks so much for your reply. The code I used to train the model is adapted from the /language/gpt/ example. I created a smaller version of the gpt2_vanilla configuration because my task did not require a model quite that large.

Now I have the model.pt file that I saved. When I try to load it with the transformers library, though, I run into problems (which makes sense, since the GPT model is imported from the titans module and not from transformers). I'd love to use this model through the huggingface/transformers library so that I can take advantage of the functionality in that ecosystem.

From the research I've done, it seems that the transformers library expects a model file with specific keys for each layer, so I'm looking into whether there is a way to resolve that discrepancy. I know that the library is supported at some level because of this blog post:

https://medium.com/@yangyou_berkeley/colossal-ai-seamlessly-accelerates-large-models-at-low-costs-with-hugging-face-4d1a887e500d

But would love some more advice for my use case. Thanks so much once again for your time and help!
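
For reference, the key discrepancy described above could in principle be handled by renaming the checkpoint's keys before loading. The following is only a minimal sketch under stated assumptions, not a confirmed conversion path: the source-side key names in `EXAMPLE_RULES` are hypothetical (the real names in a titans GPT checkpoint would have to be read from the saved file), and the helpers `remap_keys` and `diff_keys` are illustrative names, not part of ColossalAI or transformers. The target-side names follow the GPT-2 naming used by transformers (`transformer.wte`, `transformer.h.<i>.ln_1`, `lm_head`).

```python
import re
from typing import Dict, Iterable, List, Tuple, TypeVar

T = TypeVar("T")

# HYPOTHETICAL rename rules: (regex on the source key, replacement).
# The left-hand patterns are invented for illustration; print the keys of
# your own checkpoint to find the real ones.
EXAMPLE_RULES: List[Tuple[str, str]] = [
    (r"^embed\.word_embeddings\.", "transformer.wte."),
    (r"^blocks\.(\d+)\.norm1\.", r"transformer.h.\1.ln_1."),
    (r"^head\.dense\.", "lm_head."),
]


def remap_keys(state_dict: Dict[str, T], rules: List[Tuple[str, str]]) -> Dict[str, T]:
    """Return a new dict with keys renamed by the first matching rule.

    Keys that match no rule are copied unchanged, so they show up later
    as 'unexpected' in diff_keys instead of silently disappearing.
    """
    remapped: Dict[str, T] = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in rules:
            new_key, count = re.subn(pattern, replacement, key)
            if count:
                break
        if new_key in remapped:
            raise KeyError(f"two source keys map to the same target: {new_key}")
        remapped[new_key] = value
    return remapped


def diff_keys(have: Iterable[str], expected: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) key names, each sorted alphabetically."""
    have_set, expected_set = set(have), set(expected)
    return sorted(expected_set - have_set), sorted(have_set - expected_set)
```

With PyTorch installed, the source dict would typically come from `torch.load("model.pt", map_location="cpu")` (possibly nested under another key, depending on how the checkpoint was saved), and the expected names from `GPT2LMHeadModel(config).state_dict().keys()`. Renaming alone may not be enough: the GPT-2 implementation in transformers stores its attention and MLP projections in `Conv1D` layers whose weights are laid out as (in_features, out_features), so weights taken from `nn.Linear` layers would likely need transposing, and separate query/key/value weights would need to be concatenated to match `c_attn`.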
