Fix bug with setting the nvidia.com/vgpu.config.state label #15

Merged
cdesiniotis merged 1 commit into main from fix-status-label
Mar 13, 2024

Conversation

cdesiniotis
Contributor

@cdesiniotis cdesiniotis commented Mar 13, 2024

Before this change, the state label would incorrectly have a value of 'success' even if the vgpu-device-manager failed to apply a particular configuration.

Before:

time="2023-12-22T18:31:50Z" level=fatal msg="error getting vGPU config: error getting all vGPU devices: unable to read MDEV devices directory: open /sys/bus/mdev/devices: no such file or directory"
time="2023-12-22T18:31:50Z" level=info msg="Changing the 'nvidia.com/vgpu.config.state' node label to 'success'"
time="2023-12-22T18:31:50Z" level=error msg="ERROR: unable to apply config 'A10-8Q': exit status 1"
time="2023-12-22T18:31:50Z" level=info msg="Waiting for change to 'nvidia.com/vgpu.config' label"

After:

time="2023-12-22T18:44:15Z" level=fatal msg="error getting vGPU config: error getting all vGPU devices: unable to read MDEV devices directory: open /sys/bus/mdev/devices: no such file or directory"
time="2023-12-22T18:44:15Z" level=error msg="Failed to apply vGPU config: unable to apply config 'A10-8Q': exit status 1"
time="2023-12-22T18:44:15Z" level=info msg="Setting node label: nvidia.com/vgpu.config.state=failed"
time="2023-12-22T18:44:15Z" level=info msg="Waiting for change to 'nvidia.com/vgpu.config' label"

Before this change, the state label would incorrectly have a value
of 'success' even if the vgpu-device-manager failed to apply a particular configuration.

Signed-off-by: Christopher Desiniotis <[email protected]>
@cdesiniotis cdesiniotis merged commit b89399e into main Mar 13, 2024
1 check passed
@tariq1890 tariq1890 deleted the fix-status-label branch March 13, 2024 20:54