Logging a finetuned Transformer model with the mlflow.transformers flavor results in '...ForSequenceClassification' object has no attribute 'model' #424
Comments
Hi @hugocool, I am sorry that you are facing issues. Thank you for the very detailed bug report. The error comes from mlflow itself, and you should likely raise the issue there if we cannot fix it. From your example:

    from transformers import BertForSequenceClassification, Trainer, TrainingArguments

    model = BertForSequenceClassification.from_pretrained("bert-large-uncased")

    training_args = TrainingArguments(
        output_dir='./results',          # output directory
        num_train_epochs=3,              # total number of training epochs
        per_device_train_batch_size=16,  # batch size per device during training
        per_device_eval_batch_size=64,   # batch size for evaluation
        warmup_steps=500,                # number of warmup steps for learning rate scheduler
        weight_decay=0.01,               # strength of weight decay
        logging_dir='./logs',            # directory for storing logs
    )

    trainer = Trainer(
        model=model,                     # the instantiated 🤗 Transformers model to be trained
        args=training_args,              # training arguments, defined above
        train_dataset=train_dataset,     # training dataset (defined in your example)
        eval_dataset=test_dataset,       # evaluation dataset (defined in your example)
    )

    trainer.train()
    model = trainer.model

    import mlflow.transformers

    mlflow.transformers.save_model(model, "model/")  # this should raise the same error

According to mlflow documentation, it seems that the "model" should be
So this should likely work:
That does not work if I set the return of my training node to `dict(model=trainer.model)`.
What does work is to package the model as a transformers pipeline.
Indeed, a pipeline is supposed to work 👍 Looking at the mlflow doc, it seems that you need to specify an additional
If it is OK for you, I'll close the issue!
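For readers landing here, the pipeline workaround discussed above can be sketched roughly as follows. This is a minimal sketch, not the code from the thread (which did not survive): the helper names `build_pipeline_kwargs` and `save_as_pipeline`, the `"text-classification"` task, and the need to pass a tokenizer explicitly are all assumptions for illustration.

```python
from typing import Any, Dict


def build_pipeline_kwargs(model: Any, tokenizer: Any,
                          task: str = "text-classification") -> Dict[str, Any]:
    """Collect the arguments handed to transformers.pipeline (hypothetical helper)."""
    return {"task": task, "model": model, "tokenizer": tokenizer}


def save_as_pipeline(model: Any, tokenizer: Any, path: str,
                     task: str = "text-classification") -> None:
    """Wrap a fine-tuned model in a pipeline and save it with the transformers flavor.

    Hypothetical helper: the task name and tokenizer are assumptions.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    import mlflow.transformers
    from transformers import pipeline

    # mlflow.transformers expects a Pipeline (or a dict of components),
    # not a bare ...ForSequenceClassification model.
    clf = pipeline(**build_pipeline_kwargs(model, tokenizer, task))
    mlflow.transformers.save_model(transformers_model=clf, path=path)
```

In a kedro project, the training node would presumably return the pipeline object itself (rather than `trainer.model`), so that the dataset configured with the mlflow.transformers flavor receives something it can save.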
Okay, yeah. Maybe it would be great for others to have an examples section in kedro-mlflow for the different models. We could add transformers there as well. One thing I would like to add, which was also mentioned in the Kedro Slack, is the way the MetricsDataset works. I would expect to be able to just log a dict directly; instead, I need to make a separate MetricDataSet for each metric and return the metrics as a list.
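As an illustration of the "log a dict directly" idea, one possible workaround is to call plain mlflow from inside a node instead of going through a catalog dataset. The sketch below is an assumption, not kedro-mlflow's API: `to_mlflow_metrics` and `log_metrics_dict` are hypothetical helpers, and it relies only on `mlflow.log_metrics`, which accepts a flat dict of floats.

```python
from typing import Any, Dict, Mapping, Optional


def to_mlflow_metrics(metrics: Mapping[str, Any], prefix: str = "") -> Dict[str, float]:
    """Flatten a (possibly nested) metrics dict into {name: float} (hypothetical helper)."""
    flat: Dict[str, float] = {}
    for name, value in metrics.items():
        key = f"{prefix}{name}"
        if isinstance(value, Mapping):
            # Nested dicts become dotted keys, e.g. {"eval": {"f1": 0.5}} -> "eval.f1".
            flat.update(to_mlflow_metrics(value, prefix=f"{key}."))
        else:
            flat[key] = float(value)
    return flat


def log_metrics_dict(metrics: Mapping[str, Any], step: Optional[int] = None) -> None:
    """Log every metric in the dict to the current mlflow run (hypothetical helper)."""
    import mlflow  # imported lazily; assumes a run is active, or mlflow starts one

    mlflow.log_metrics(to_mlflow_metrics(metrics), step=step)
```

This bypasses the catalog, so the metrics do not appear as kedro datasets; whether that trade-off is acceptable depends on the project.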
I am closing the issue because there is another one open about
I will accept documentation PRs, but it is not possible for me to document all these specific behaviours, which are pure mlflow and much better documented in their documentation.
Logging a finetuned Transformer model with the mlflow.transformers flavor results in '...ForSequenceClassification' object has no attribute 'model'
Context
I am training a transformer model for a downstream task and would like to save the model in mlflow using the mlflow.transformers flavor. However, when actually logging the trained model, for example in Jupyter with the catalog.save function, I get an error, whereas this works fine when using the pickle dataset.
So what is the correct way to log transformers using the mlflow.transformers flavor?
Steps to Reproduce
From the training example:
And now where the error happens:
catalog.save('finetuned_classifier', model)
Expected Result
Actual Result
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
kedro and kedro-mlflow version used (pip show kedro and pip show kedro-mlflow):
Python version (python -V): python 3.9.10
Operating system: amazon linux SUSE