switch from `training_args.bin` to `training_args.json` #35010

not-lain · 2024-11-29T03:14:06Z

What does this PR do?

Switch from training_args.bin to training_args.json and capture only the parameters that the user passed.
This uses the same approach as huggingface_hub's PyTorchModelHubMixin, which stores as few parameters as possible.
A minimal way to test this PR:

from transformers import TrainingArguments
args = TrainingArguments(output_dir="folder", eval_strategy="no")  # or any other parameters
print(args.to_json_string())
# outputs
"""
{
  "output_dir": "folder",
  "eval_strategy": "no",
  "logging_dir": "folder\\runs\\Nov29_02-44-45_iphone-laptop"
}
"""
# logging_dir is a special parameter that is always captured and added to the training_args because we want to ensure consistency

# stores the parameters into a file
args.to_json_file("training_args.json")

# loads an instance using the class directly
args2 = TrainingArguments.from_json_file("training_args.json")

With this approach, we store only the parameters that the user set explicitly, not the ones left at their default values or inferred from the system (e.g. CPU, CUDA, TPU), which leaves some room for flexibility.
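To illustrate the idea of storing only user-passed parameters, here is a minimal, self-contained sketch. It is not the code in this PR: `MiniArgs` and `capture_init_kwargs` are hypothetical names made up for the example, and the real TrainingArguments has many more fields and extra handling (such as logging_dir).

```python
import functools
import json
from dataclasses import dataclass, fields


def capture_init_kwargs(cls):
    """Hypothetical decorator: remember only the arguments the caller passed."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        # Map positional arguments onto field names, then add keyword arguments.
        names = [f.name for f in fields(cls) if f.init]
        passed = dict(zip(names, args))
        passed.update(kwargs)
        self._init_kwargs = passed  # defaults are intentionally not recorded
        original_init(self, *args, **kwargs)

    cls.__init__ = wrapped_init
    return cls


@capture_init_kwargs
@dataclass
class MiniArgs:
    # Stand-in for TrainingArguments with a few illustrative fields.
    output_dir: str = "out"
    eval_strategy: str = "no"
    learning_rate: float = 5e-5

    def to_json_string(self) -> str:
        # Serialize only what the user passed explicitly.
        return json.dumps(self._init_kwargs, indent=2)

    @classmethod
    def from_json_string(cls, text: str) -> "MiniArgs":
        # Unlisted fields fall back to the current defaults on load.
        return cls(**json.loads(text))


args = MiniArgs(output_dir="folder", eval_strategy="no")
restored = MiniArgs.from_json_string(args.to_json_string())
```

In this sketch, `learning_rate` never reaches the JSON, so a reloaded instance picks up whatever default is current at load time. An explicitly passed value is stored even when it equals the default, as with `eval_strategy="no"` above.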

In a sense, the parameters are mutable: the user can edit the stored JSON file by hand.

Fixes #34612

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@muellerzr @SunMarc

not-lain added 4 commits November 29, 2024 01:45

capture init parameters in training_args

a6484c0

update relevant attributes

780d3e7

attribute calling for training args

66c655b

add class attribute to load the training_args from a local file

6d3a186
