Pull request for the checkpoint conversion of BERT(486) #761
base: master
Conversation
@vulkomilev, I've submitted a quick review. Please take a look. I had a few questions/suggestions:
- Have you tried running this script for all presets? Does it work fine?
- The `check_output` function is not called in `main()`.
- Could you please run the formatter/linter? It seems like this script has not been formatted. You have to follow the steps listed here: https://github.com/keras-team/keras-nlp/blob/master/CONTRIBUTING.md#formatting-code.
return keras_nlp_output
...
def main(_):
We should call `check_outputs()` in `main()`.
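For illustration, a minimal sketch of wiring that verification into `main()`; the `check_outputs` body and the dummy arrays are stand-ins for whatever the script actually compares, not the PR's implementation:

```python
import numpy as np
from absl import app


def check_outputs(keras_nlp_output, original_output, atol=1e-4):
    # Numerically compare the converted model's output to the reference output.
    np.testing.assert_allclose(keras_nlp_output, original_output, atol=atol)
    print("Outputs match within tolerance.")


def main(_):
    # In the real script these would come from the converted KerasNLP model
    # and the original checkpoint; dummy arrays here just show the call site.
    keras_nlp_output = np.zeros((1, 8, 128))
    original_output = np.zeros((1, 8, 128))
    check_outputs(keras_nlp_output, original_output)


if __name__ == "__main__":
    app.run(main)
```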
return vocab_path, checkpoint_path, config_path, weights, model
...
def convert_checkpoints(preset, weights, model):
The `if...elif...else` logic still looks a bit complicated. Instead of using two sources (TF MG and the original BERT repo), we're planning to just use the original BERT repo. That might make this whole block simpler. Let me do that for you!
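As a rough sketch of what that simplification could look like, the per-preset branches can become a single lookup table; the presets and hyperparameters below are just the standard BERT sizes, shown for illustration rather than taken from this PR:

```python
# Hypothetical table replacing the if/elif/else chain; extend with one entry per preset.
PRESET_CONFIGS = {
    "bert_tiny_en_uncased": {"num_layers": 2, "hidden_dim": 128, "num_heads": 2},
    "bert_base_en_uncased": {"num_layers": 12, "hidden_dim": 768, "num_heads": 12},
}


def get_config(preset):
    # One lookup instead of branching on every preset name.
    return PRESET_CONFIGS[preset]


print(get_config("bert_base_en_uncased"))
```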
@vulkomilev it looks like a lot of the comments above were resolved without being addressed. Did you mean to push a change to this branch?
I still haven't pushed the changes; I will do so in about an hour.
return vocab_path, checkpoint_path, config_path
...
def convert_checkpoints(preset,checkpoint_path,config_dict):
Make sure to run our format script; this does not look like it has been Black-formatted in many places (the contributor guide has instructions).
model = keras_nlp.models.BertBackbone.from_preset("bert_tiny_en_uncased",
    load_weights=True)  # keras_nlp.models.BertBase(vocabulary_size=VOCAB_SIZE)
model.summary()
if preset in ['bert_base_en_uncased', 'bert_base_en']:
I don't think we should need all these if cases based on preset. Why do we need to add all these?
It's probably easiest to write this for checkpoints from a single source. The official BERT repo has all the checkpoints we need. So you could model this on https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_tiny_uncased_en.ipynb, for example.
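For reference, a minimal sketch of pulling one checkpoint from that single source, along the lines of the notebook; the archive name (`uncased_L-2_H-128_A-2`, i.e. bert_tiny) and the local paths are assumptions for illustration:

```python
import os
import urllib.request
import zipfile

MODEL_SUFFIX = "uncased"
MODEL_SPEC_STR = "L-2_H-128_A-2"  # assumed bert_tiny spec string
url = (
    "https://storage.googleapis.com/bert_models/2020_02_20/"
    f"{MODEL_SUFFIX}_{MODEL_SPEC_STR}.zip"
)

archive = f"{MODEL_SUFFIX}_{MODEL_SPEC_STR}.zip"
extract_dir = f"{MODEL_SUFFIX}_{MODEL_SPEC_STR}"
if not os.path.exists(archive):
    urllib.request.urlretrieve(url, archive)
with zipfile.ZipFile(archive) as zf:
    zf.extractall(extract_dir)
print(os.listdir(extract_dir))  # expect bert_config.json, vocab.txt, and ckpt files
```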
Here we don't download from a fixed source; the URL is built from the template `f"https://storage.googleapis.com/bert_models/2020_02_20/{MODEL_SUFFIX}_{MODEL_SPEC_STR}.zip"`.
@mattdangerw - I will push some changes later today; I'll use checkpoints from a single source.
@abheesht17 any update on this?
FLAGS = flags.FLAGS

PRESET_MAP = {
    "bert_base_cased": {'base': "roberta.base",
Are these big dicts actually used anywhere? They don't seem to be.
What you probably need is a mapping from our preset names to checkpoint files, and that's it!
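A sketch of the kind of mapping that comment describes, from KerasNLP preset names to checkpoint archives in the official BERT bucket; the archive names below follow the standard published naming but should be double-checked rather than taken from this PR:

```python
# Hypothetical preset -> checkpoint archive mapping (single source).
PRESET_MAP = {
    "bert_tiny_en_uncased": "uncased_L-2_H-128_A-2",
    "bert_small_en_uncased": "uncased_L-4_H-512_A-8",
    "bert_base_en_uncased": "uncased_L-12_H-768_A-12",
    "bert_base_en": "cased_L-12_H-768_A-12",
}

BASE_URL = "https://storage.googleapis.com/bert_models/2020_02_20/"


def checkpoint_url(preset):
    # Build the full download URL for a given preset.
    return f"{BASE_URL}{PRESET_MAP[preset]}.zip"


print(checkpoint_url("bert_base_en_uncased"))
```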
They are used in the configuration of the model. Where can I find the list of checkpoint files to map them?
@vulkomilev, do you mind if I push changes on top of yours? There are some modifications to make here, especially w.r.t. the single-source BERT ckpts.
Yeah, no problem.
Hi @abheesht17, I am encountering an error with the conversion. What is the dimension of the last layer for bert_base_uncased? For Hugging Face it is 768.
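For what it's worth, the hidden size of uncased BERT base is 768, matching the Hugging Face value; one way to confirm what the KerasNLP backbone produces is to run dummy inputs through it and print the output shapes (the input/output names below assume the standard `BertBackbone` interface):

```python
import numpy as np
import keras_nlp

model = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")
batch = {
    "token_ids": np.ones((1, 12), dtype="int32"),
    "segment_ids": np.zeros((1, 12), dtype="int32"),
    "padding_mask": np.ones((1, 12), dtype="int32"),
}
outputs = model(batch)
for name, tensor in outputs.items():
    print(name, tensor.shape)  # sequence_output should end in 768 for bert_base
```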
bump |
Made a new release, but I have a problem with the shapes. Can someone check it?
@mattdangerw @abheesht17 bump