Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testiso: improve error when QEMU is terminated #3192

Merged
merged 2 commits into from
Nov 11, 2022

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Nov 11, 2022

Right now when QEMU gets killed during a testiso run, the error is:

FAIL: pxe-offline-install (bios + metal) (1m10.277s)
    Got EOF from completion channel, coreos-installer-test-OK expected

This is accurate but doesn't hint well enough at the underlying cause.
Rework the two spots in which we wait for virtio-serial strings to also
check if QEMU was killed to provide a better error. E.g.:

FAIL: pxe-install (bios + metal) (34.483s)
    QEMU unexpectedly exited while waiting awaiting completion: process killed

Related: coreos/fedora-coreos-tracker#1339

Add helpers to `QemuInstance` and the underlying `ExecCmd` for querying
whether the process was terminated by a signal.
Right now when QEMU gets killed during a testiso run, the error is:

    FAIL: pxe-offline-install (bios + metal) (1m10.277s)
        Got EOF from completion channel, coreos-installer-test-OK expected

This is accurate but doesn't hint well enough at the underlying cause.
Rework the two spots in which we wait for virtio-serial strings to also
check if QEMU was killed to provide a better error. E.g.:

    FAIL: pxe-install (bios + metal) (34.483s)
        QEMU unexpectedly exited while waiting awaiting completion: process killed

Related: coreos/fedora-coreos-tracker#1339
Copy link
Member

@marmijo marmijo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dustymabe dustymabe enabled auto-merge (rebase) November 11, 2022 17:58
@@ -93,6 +96,14 @@ func (cmd *ExecCmd) Kill() error {
return err
}

func (cmd *ExecCmd) Signaled() bool {
if cmd.ProcessState == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the Go race detector will flag this; I believe the separate goroutine that is blocking in Wait() will be the one that writes to this data. Probably hard to fix though without reworking things completely.

If we were to rework things...I think we want a single stream of events from the qemu process effectively, then we could centralize the sleep(1) thing there.

@dustymabe dustymabe disabled auto-merge November 11, 2022 18:50
@dustymabe dustymabe enabled auto-merge (rebase) November 11, 2022 18:50
@dustymabe dustymabe merged commit 13c2a7d into coreos:main Nov 11, 2022
@jlebon jlebon deleted the pr/testiso-killed branch April 22, 2023 23:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants