
Regarding continued pre-training on OAG #3

Open
RealDuxy opened this issue Jan 4, 2022 · 3 comments

Comments

@RealDuxy

RealDuxy commented Jan 4, 2022

Hello, I am trying to apply OAG-BERT's pre-training strategy to another domain. Have you open-sourced the OAG-BERT pre-training code yet? I could not find it in the source code. Thank you.

@Somefive
Collaborator

Currently, we do not release the pre-training code (and as far as I know, the pre-training part will not be released). Sorry for that. But you can refer to the RoBERTa training code if you want to reproduce the training of OAG-BERT. By the way, if you just want to continue training on OAG-BERT, you do not necessarily need the exact same code as the pre-training part; following any other BERT model fine-tuning code should be fine, I think.
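
For reference, a minimal sketch of what such continued masked-language-model training could look like with the Hugging Face `transformers` library (not the original OAG-BERT pipeline). The checkpoint path `path/to/oagbert-checkpoint` and the corpus file `domain_corpus.txt` are placeholders, not files from this repository.

```python
# Hedged sketch: continue masked-LM training on an OAG-BERT-style checkpoint
# with the standard Hugging Face `transformers` training utilities.
# The checkpoint path and corpus file below are placeholders, not repo files.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("path/to/oagbert-checkpoint")
model = BertForMaskedLM.from_pretrained("path/to/oagbert-checkpoint")

# Plain-text corpus from the target domain, one document per line.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard random 15% masking, as in ordinary BERT pre-training.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="oagbert-continued",
    per_device_train_batch_size=16,
    num_train_epochs=1,
    learning_rate=5e-5,
)

Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()
```

This treats the continued pre-training as ordinary MLM fine-tuning of a BERT checkpoint; it does not reproduce the entity-aware objectives of the original OAG-BERT pre-training.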

@DavidLeexxxx

Hi, thanks for your excellent work! I am still stuck on this fine-tuning issue, since OAG-BERT is currently built on the CogDL ecosystem, which I am not very familiar with. The OAG-BERT documentation does not show the fine-tuning logic, only some APIs. Could you update the documentation or provide some coding details or demos on how to fine-tune this wonderful work?

@Somefive
Collaborator

Somefive commented Feb 3, 2023


We do not plan to release the pre-training and fine-tuning code because it is not ready for general use. We were using the DeepSpeed training framework, and the code contains a lot of ad-hoc debugging and logging, so it is far from ready for release. If you are interested, I would recommend trying the Microsoft DeepSpeed framework, or using a general transformer training script. The mechanism is the same.
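
In case it helps, below is a hedged sketch of the general DeepSpeed pattern for a BERT-style masked-LM model. The checkpoint name, the tiny in-memory batch, and the ZeRO settings are illustrative assumptions, not the configuration actually used for OAG-BERT.

```python
# Hedged sketch: wrap a BERT masked-LM model in DeepSpeed.
# Checkpoint, batch, and config values are illustrative assumptions only.
import deepspeed
from transformers import BertForMaskedLM, BertTokenizerFast

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-5}},
    "zero_optimization": {"stage": 2},
}

# DeepSpeed wraps the model and owns the optimizer, gradient, and ZeRO logic.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

# A trivial in-memory batch, just to demonstrate the training step.
texts = ["An example sentence from the target domain."] * 8
enc = tokenizer(texts, return_tensors="pt", padding=True)
enc["labels"] = enc["input_ids"].clone()
batch = {k: v.to(engine.device) for k, v in enc.items()}

loss = engine(**batch).loss
engine.backward(loss)  # replaces loss.backward()
engine.step()          # replaces optimizer.step() + zero_grad()
```

The same loop also works with a general transformer training script; DeepSpeed only changes how the backward and optimizer steps are driven.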
