Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add new GPU runner for E2E job, incorporate unit tests into existing runner #71

Closed
wants to merge 1 commit into from

Conversation

nathan-weinberg
Copy link
Member

@nathan-weinberg nathan-weinberg commented Jul 16, 2024

This PR does the following:

  • Adds a new GPU runner job (AWS-based) we can use for full E2E runs (current runner was just sanity checking integration)
  • Changes existing GPU runner to run our unit test suite
  • Some modifications to unit tests so they can be run in CI

Resolves #12

@nathan-weinberg nathan-weinberg force-pushed the gpu-runner branch 3 times, most recently from 7b5a2d5 to dfd6430 Compare July 16, 2024 14:53
@nathan-weinberg nathan-weinberg force-pushed the gpu-runner branch 3 times, most recently from 055be45 to 9707f67 Compare July 16, 2024 15:59
@nathan-weinberg nathan-weinberg changed the title [WIP] Add GPU runner job [WIP] Add new GPU runner for E2E job, incorporate unit tests into existing runner Jul 16, 2024
@nathan-weinberg nathan-weinberg force-pushed the gpu-runner branch 3 times, most recently from ab0af26 to 4f4be69 Compare July 16, 2024 17:20
@@ -1,6 +1,7 @@
# eval

![Lint](https://github.com/instructlab/eval/actions/workflows/lint.yml/badge.svg?branch=main)
![Test](https://github.com/instructlab/eval/actions/workflows/test.yml/badge.svg?branch=main)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nathan-weinberg on line 38, 39 of this file can you mention how to run the tests with pytest individually and as a group?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

python3 -m pip install .
python3.11 -m pip install .
# start llama-cpp server
ilab model download --repository instructlab/granite-7b-lab-GGUF --filename granite-7b-lab-Q4_K_M.gguf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since what we want to do with ilab is pretty basic, I'd suggest we just install instructlab from pypi.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree except the current pypi package is pretty out-of-date - using it now is just going to require us to change a bunch of stuff when the next release comes out - but once 0.18.0 lands that'll make sense

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should install from pypi, installing instructlab from pypi won't change the ilab model download command

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nathan-weinberg we can install the beta releases with --pre and then remove that once 0.18.0 lands

python3.11 -m pip install .
# start llama-cpp server
ilab model download --repository instructlab/granite-7b-lab-GGUF --filename granite-7b-lab-Q4_K_M.gguf
ilab model serve --model-path /home/runner/.local/share/instructlab/models/granite-7b-lab-Q4_K_M.gguf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the functional tests scripts we shut down the server to clean up. I wonder if we should split all of this into a shell script that manages ilab installation, server startup, pytest running, and server shutdown.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really, we shouldn't need ilab at all to run unit or functional tests for this library - it should operate independently of the CLI

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna look into some other possible approaches

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really, we shouldn't need ilab at all to run unit or functional tests for this library - it should operate independently of the CLI

I think this makes sense for unit tests, but functional ones should prob still use a server from ilab etc

Copy link
Member

@alinaryan alinaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff

python3 -m pip install .
python3.11 -m pip install .
# start llama-cpp server
ilab model download --repository instructlab/granite-7b-lab-GGUF --filename granite-7b-lab-Q4_K_M.gguf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should install from pypi, installing instructlab from pypi won't change the ilab model download command

python3.11 -m pip install .
# start llama-cpp server
ilab model download --repository instructlab/granite-7b-lab-GGUF --filename granite-7b-lab-Q4_K_M.gguf
ilab model serve --model-path /home/runner/.local/share/instructlab/models/granite-7b-lab-Q4_K_M.gguf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really, we shouldn't need ilab at all to run unit or functional tests for this library - it should operate independently of the CLI

I think this makes sense for unit tests, but functional ones should prob still use a server from ilab etc

@@ -53,6 +53,7 @@ coverage.xml
.hypothesis/
.pytest_cache/
cover/
eval_output/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taxonomy too?

- name: Install ilab
run: |
export CUDA_HOME="/usr/local/cuda"
export LD_LIBRRY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIBRRY :)

# Install the local version of eval before installing the CLI so PR changes are included
python3.11 -m pip install .

python3.11 -m pip install instructlab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Question) AFAIU this assumes that instructlab package never caps / pins the eval library in an incompatible way. Otherwise, this line could revert the eval package to the one from pypi. Is it an acceptable assumption?

Use previous GPU runner for unit tests

Signed-off-by: Nathan Weinberg <[email protected]>
@mergify mergify bot added CI/CD Affects CI/CD configuration documentation Improvements or additions to documentation testing Relates to testing ci-failure labels Sep 24, 2024
@nathan-weinberg
Copy link
Member Author

closing in favor of #155

@nathan-weinberg nathan-weinberg deleted the gpu-runner branch November 4, 2024 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/CD Affects CI/CD configuration ci-failure documentation Improvements or additions to documentation testing Relates to testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add test runner to CI
6 participants