Create interface for accessing the training code #12

Merged: RobotSail merged 2 commits into instructlab:main from os-add-interface on Jun 17, 2024


Member

@RobotSail RobotSail commented Jun 11, 2024

This PR turns the training code into a library.

View the following PR for an example: instructlab/instructlab#1329

The main idea is that we provide a function like:

def train_torchrun(torchargs: TorchrunArguments, training_args: FullTrainingArguments):
  pass

The TorchrunArguments and FullTrainingArguments classes each hold one group of training-specific arguments. Another library then simply passes these arguments to the train_torchrun function.

For any other training method that we define, we could provide a similar interface depending on which arguments are needed.

We separate the arguments here because TorchrunArguments are the ones passed to torchrun, while FullTrainingArguments are the ones that we actually train with. It's not crucial that these are different, but it makes our lives a lot easier from a maintenance standpoint.
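As a rough illustration of the split described above, here is a minimal sketch. It is not the code in this PR: the field names, the `build_command` helper, and the `train.py` script name are all hypothetical, and only the two class names and `train_torchrun` come from the description. The sketch only assembles a command line so that the two argument groups stay visibly separate.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class TorchrunArguments:
    # Hypothetical fields: launcher-level settings handed to torchrun.
    nproc_per_node: int = 1
    nnodes: int = 1


@dataclass
class FullTrainingArguments:
    # Hypothetical fields: settings the training script itself consumes.
    model_path: str = "path/to/model"
    num_epochs: int = 1


def build_command(
    torch_args: TorchrunArguments,
    train_args: FullTrainingArguments,
    script: str = "train.py",  # hypothetical script name
) -> List[str]:
    """Assemble a torchrun-style command line without running it."""
    cmd = ["torchrun"]
    # Launcher flags go before the script path.
    cmd += [f"--{key}={value}" for key, value in asdict(torch_args).items()]
    cmd.append(script)
    # Training flags go after the script path.
    cmd += [f"--{key}={value}" for key, value in asdict(train_args).items()]
    return cmd


def train_torchrun(
    torch_args: TorchrunArguments, train_args: FullTrainingArguments
) -> List[str]:
    # A real implementation would launch the command (for example with
    # subprocess); this sketch only returns it for inspection.
    return build_command(torch_args, train_args)
```

A caller in another package would construct the two objects and pass them in, for example `train_torchrun(TorchrunArguments(nproc_per_node=2), FullTrainingArguments(num_epochs=3))`.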

setup.py Outdated Show resolved Hide resolved
ilab_train/config.py Outdated Show resolved Hide resolved
@RobotSail
Member Author

instructlab/eval#1

Member

@russellb russellb left a comment

I haven't reviewed the code part, just pyproject/requirements/namespace package portions. That all looks good, though I gave some suggested edits for some minor cleanup. That includes dropping Python 3.9 since I think we landed on 3.10 as the minimum.

pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
Member

@nathan-weinberg nathan-weinberg left a comment

Just focused on the Python packaging stuff, will leave the training code itself to y'all for now 😄

pyproject.toml Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
@RobotSail RobotSail force-pushed the os-add-interface branch 2 times, most recently from a0a812a to 4296d0d Compare June 13, 2024 15:32
@RobotSail RobotSail force-pushed the os-add-interface branch 2 times, most recently from a72396a to 493e755 Compare June 14, 2024 15:03
@Maxusmusti
Contributor

Note: Current requirements will overwrite existing nvidia pytorch installs. Need to ensure that if those exist, we are not installing our own torch.
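One way to approach the concern above is to check for an existing install before deciding what to install. The sketch below is an assumption about how that could look, not something taken from this PR: the function names and the `protected` parameter are hypothetical, and the name parsing is deliberately simplistic. It drops a requirement line only when the package is on the protected list and is already installed.

```python
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def requirements_to_install(lines, protected=("torch",), probe=installed_version):
    """Keep every requirement line except protected packages already present."""
    kept = []
    for line in lines:
        # Take the text before the first version operator, extra, or marker.
        name = re.split(r"[<>=!~\[;\s]", line.strip(), maxsplit=1)[0].lower()
        if name in protected and probe(name) is not None:
            continue  # leave the existing install alone
        kept.append(line)
    return kept
```

The `probe` argument exists so the behaviour can be exercised without touching the real environment.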

Contributor

@JamesKunstle JamesKunstle left a comment

Two thoughts, one of them a minor nitpick, but in general this is very good. I'll check that it runs.

README.md Outdated Show resolved Hide resolved
src/instructlab/training/config.py Show resolved Hide resolved
@RobotSail
Member Author

Thank you for the review @JamesKunstle, I've created an issue about your quantization comment here: #29

Contributor

@cdoern cdoern left a comment

Initial comments. I want to make sure this is aligned with the CLI implementation.

src/instructlab/training/main_ds.py Show resolved Hide resolved
src/instructlab/training/main_ds.py Show resolved Hide resolved
src/instructlab/training/main_ds.py Show resolved Hide resolved
Member

@russellb russellb left a comment

Can you squash the commit history here before merging?

Contributor

@JamesKunstle JamesKunstle left a comment

v good thank you for all your work

Member

@russellb russellb left a comment

The biggest question here is about the use of class variables.

In terms of PR structure -- it would have been nice to have the PR that's doing the library setup separate from all the API introduction stuff.
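The code behind the class-variable question is not shown in this thread. Assuming the concern is the usual Python one, where a mutable class variable is shared by every instance, a minimal sketch of the difference looks like this. Both class names and the `target_modules` field are hypothetical.

```python
from dataclasses import dataclass, field


class SharedConfig:
    # Class variable: a single list object shared by every instance.
    target_modules = []


@dataclass
class PerInstanceConfig:
    # Instance field: each object gets its own fresh list.
    target_modules: list = field(default_factory=list)


a, b = SharedConfig(), SharedConfig()
a.target_modules.append("q_proj")  # mutates the class-level list
shared_leak = b.target_modules     # b observes the change made through a

c, d = PerInstanceConfig(), PerInstanceConfig()
c.target_modules.append("q_proj")  # mutates only c's own list
isolated = d.target_modules        # d is unaffected
```

With immutable defaults (numbers, strings) the sharing is harmless; the difference matters once a default is a list or dict.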

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
requirements.txt Show resolved Hide resolved
src/instructlab/training/config.py Show resolved Hide resolved
Contributor

@Maxusmusti Maxusmusti left a comment

LGTM, tested full training (no lora/offload) with both base scripts and updated interface, both functioning as expected

@Maxusmusti Maxusmusti requested a review from russellb June 17, 2024 18:20
setup.py Outdated Show resolved Hide resolved
.gitignore Show resolved Hide resolved
@russellb
Member

just to be super clear:

  1. squash the first two commits and fix up the commit message
  2. create a follow-up issue for cleaning up requirements.txt to specify versions for all dependencies + making sure the versions are aligned with instructlab/instructlab if it's a shared dependency
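The cleanup asked for in the second item could be checked mechanically. The following is a sketch under simplifying assumptions, not a tool from this repository: the function names are hypothetical, the parsing ignores URLs and other pip options, and "aligned" is taken to mean that a shared dependency has an identical version specifier in both files.

```python
import re

# Matches the start of a version specifier such as ==, >=, <, ~= or !=.
VERSION_OP = re.compile(r"[<>]=?|[=~!]=")


def parse_requirements(lines):
    """Map each package name to its version specifier ('' if none)."""
    specs = {}
    for raw in lines:
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r or -e
        match = VERSION_OP.search(line)
        name = (line[: match.start()] if match else line).strip().lower()
        specs[name] = line[match.start():].strip() if match else ""
    return specs


def unversioned(specs):
    """Names of dependencies that specify no version at all."""
    return [name for name, spec in specs.items() if not spec]


def mismatched(ours, theirs):
    """Shared dependencies whose specifiers differ between two files."""
    return {
        name: (ours[name], theirs[name])
        for name in ours
        if name in theirs and ours[name] != theirs[name]
    }
```

In practice the two inputs would be the lines of this repository's requirements.txt and those of instructlab/instructlab.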

RobotSail and others added 2 commits June 17, 2024 15:31
Signed-off-by: Oleg S <[email protected]>

turn the training code into a library that can be invoked by another package

Signed-off-by: Oleg S <[email protected]>

update interface

Signed-off-by: Oleg S <[email protected]>

rename ilab_train -> train

Signed-off-by: Oleg S <[email protected]>

remove setup.py in favor of pyproject.toml

Signed-off-by: Oleg S <[email protected]>
@RobotSail RobotSail merged commit 2a700b1 into instructlab:main Jun 17, 2024
2 checks passed
@russellb
Member

  1. create a follow-up issue for cleaning up requirements.txt to specify versions for all dependencies + making sure the versions are aligned with instructlab/instructlab if it's a shared dependency

We talked about this, so I know your intent was to create this issue, but I can't find it anywhere. I went ahead and filed this one: #34


6 participants