Rewrite training component using kubeflow-training library #231
@Shreyanand be sure to run
@MichaelClifford I did run it and it doesn't change any file, but somehow the pre-commit test fails 🤔
It's the imports that are failing so you need to run:
It’s more of a stylistic preference, but since the current function is quite bloated, breaking it into smaller sub-functions to handle the creation of master and worker containers could enhance readability. Feel free to implement or not.
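To make the suggestion concrete, here is a minimal sketch of what such a split could look like. It is not the PR's actual code: it builds a plain PyTorchJob manifest as a dictionary instead of using the kubeflow-training models, and every function and parameter name (`build_container`, `build_master_spec`, `build_worker_spec`, `build_pytorch_job`, `gpus`, `workers`) is hypothetical.

```python
from typing import Any, Dict, List


def build_container(image: str, command: List[str], gpus: int) -> Dict[str, Any]:
    """Container spec shared by master and worker (hypothetical helper)."""
    return {
        # The training operator looks for a container named "pytorch".
        "name": "pytorch",
        "image": image,
        "command": command,
        "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
    }


def build_replica_spec(container: Dict[str, Any], replicas: int) -> Dict[str, Any]:
    """Wrap a container in a replica spec with a pod template."""
    return {
        "replicas": replicas,
        "restartPolicy": "OnFailure",
        "template": {"spec": {"containers": [container]}},
    }


def build_master_spec(image: str, command: List[str], gpus: int) -> Dict[str, Any]:
    """Master replica spec: a single replica."""
    return build_replica_spec(build_container(image, command, gpus), replicas=1)


def build_worker_spec(
    image: str, command: List[str], gpus: int, replicas: int
) -> Dict[str, Any]:
    """Worker replica spec: same container, configurable replica count."""
    return build_replica_spec(build_container(image, command, gpus), replicas=replicas)


def build_pytorch_job(
    name: str, image: str, command: List[str], workers: int, gpus: int
) -> Dict[str, Any]:
    """Assemble the full manifest from the two smaller builders."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": build_master_spec(image, command, gpus),
                "Worker": build_worker_spec(image, command, gpus, workers),
            }
        },
    }
```

With this shape the top-level function only assembles the job, and the master and worker details can be read and tested on their own.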
Dismissing stale change requests... Seems like adding a new review with "comment" action is not enough here. 🤷
/hold till the official RHOAI python image is released.
Nice job!
Just a nit. Can we get this one formatted better:
https://github.com/opendatahub-io/ilab-on-ocp/pull/231/files#r1887046992
Otherwise LGTM 🙂
503492a to 67532f0
/lgtm
/approve
Update launcher component with correct params
Remove namespace argument
Fix formatting
Add PR suggestions
Add changes for RHEL AI image 1.3
Add changes for RHEL AI image 1.3
Add ruff changes
Fix pipeline errors
Add RHEL 1.3.1 image
Add ruff changes
Add uv make pipeline.yaml
Change launcher image and get logs func
Add make pipeline changes
Add latest RHOAI python image
Fix rebase errors
Change formatting errors and add right image shas
Rebase over latest changes

Signed-off-by: Shreyanand <[email protected]>
@tumido @HumairAK @MichaelClifford
This PR changes the pytorch_manifest_op to pytorch_job_launcher_op with the following changes:
To do