Create interface for accessing the training code #12
Conversation
I haven't reviewed the code part, just pyproject/requirements/namespace package portions. That all looks good, though I gave some suggested edits for some minor cleanup. That includes dropping Python 3.9 since I think we landed on 3.10 as the minimum.
Just focused on the Python packaging stuff, will leave the training code itself to y'all for now 😄
Note: the current requirements will overwrite existing NVIDIA PyTorch installs. We need to ensure that if one of those exists, we are not installing our own torch.
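For illustration, one way to implement such a guard (a minimal sketch, not the mechanism this PR uses; it only checks whether any torch distribution is already importable before pulling in our own pin):

```python
import importlib.util


def torch_already_installed() -> bool:
    """Return True if any torch distribution is already importable,
    e.g. the build shipped in NVIDIA's PyTorch containers."""
    return importlib.util.find_spec("torch") is not None


if torch_already_installed():
    # Skip installing/pinning our own torch so we don't clobber
    # the existing (possibly NVIDIA-optimized) build.
    print("Existing torch install detected; not overwriting it.")
```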
Two thoughts, one being a minor nit-pick, but in general this is very good. I'll check that it runs.
Thank you for the review @JamesKunstle, I've created an issue about your quantization comment here: #29
Initial comments. Want to make sure this is aligned with the CLI implementation.
Can you squash the commit history here before merging?
Very good, thank you for all your work.
Biggest question here is about the use of class variables.
In terms of PR structure -- it would have been nice to have the PR that's doing the library setup separate from all the API introduction stuff.
LGTM, tested full training (no lora/offload) with both base scripts and updated interface, both functioning as expected
just to be super clear:
Squashed commits (each one Signed-off-by: Oleg S <[email protected]>):
- turn the training code into a library that can be invoked by another package
- update interface
- rename ilab_train -> train
- remove setup.py in favor of pyproject.toml
We talked about this, so I know your intent was to create this issue, but I can't find it anywhere. I went ahead and filed this one: #34
This PR turns the training code into a library.

View the following PR for an example: instructlab/instructlab#1329

The main idea is that we provide a function like:
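(The original code block here didn't survive; below is a minimal sketch built from the class and function names mentioned in this description. The field names and defaults are illustrative placeholders, not the exact ones defined in this PR.)

```python
from dataclasses import dataclass


@dataclass
class TorchrunArguments:
    """Arguments forwarded to the torchrun launcher itself."""
    nnodes: int = 1
    nproc_per_node: int = 1
    node_rank: int = 0
    rdzv_endpoint: str = "127.0.0.1:12345"


@dataclass
class FullTrainingArguments:
    """Arguments consumed by the training loop."""
    model_path: str = "path/to/model"
    data_path: str = "path/to/data.jsonl"
    ckpt_output_dir: str = "checkpoints/"
    num_epochs: int = 1
    learning_rate: float = 2e-5


def train_torchrun(
    torchrun_args: TorchrunArguments,
    training_args: FullTrainingArguments,
) -> None:
    """Build a torchrun command from torchrun_args and launch the
    training entry point with training_args."""
    ...
```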
Each class, `TorchrunArguments` and `FullTrainingArguments`, provides training-specific arguments. Then, from another library, you would simply provide these arguments to the `train_torchrun` function. For any other training method that we define, we could provide a similar interface depending on which arguments are needed.

We separate the arguments here because `TorchrunArguments` are the ones passed to `torchrun`, while the full training arguments are the ones that we actually train with. It's not crucial that these are different, but it makes our lives a lot easier from a maintenance standpoint.
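A hypothetical call from another package might then look like this (the `instructlab.training` import path is an assumption based on the namespace-package setup discussed above, and the field values are placeholders):

```python
from instructlab.training import (
    FullTrainingArguments,
    TorchrunArguments,
    train_torchrun,
)

# Launcher-level settings go to torchrun; everything the training
# loop needs lives in FullTrainingArguments.
train_torchrun(
    TorchrunArguments(nnodes=1, nproc_per_node=8),
    FullTrainingArguments(
        model_path="path/to/model",
        data_path="path/to/data.jsonl",
        ckpt_output_dir="checkpoints/",
        num_epochs=2,
    ),
)
```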