arp unreliable on minimized images due to lack of network activity #3822

Open
Saviq opened this issue Dec 6, 2024 · 6 comments
@Saviq
Collaborator

Saviq commented Dec 6, 2024

Describe the bug
When using snapcraft images, instances time out on launch because they don't generate enough network activity on startup.

To Reproduce
How, and what happened?

  1. multipass launch snapcraft:core24 -n test
  2. arp -an has the IP
  3. multipass stop test prunes the IP immediately
  4. multipass start test times out
  5. ping 192.168.66.4 immediately unblocks
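For anyone checking step 2 by script, here is a minimal sketch of looking up an instance's IP in `arp -an` output by MAC address. The line shape in the comment is an assumption about macOS output, not copied from this issue, and `ip_from_arp` / `normalize_mac` are hypothetical helper names, not Multipass code. The sketch also assumes `arp` may print octets without leading zeros, so it pads them before comparing.

```python
import re

# Assumed shape of a macOS `arp -an` line (illustrative, not taken from the issue):
#   ? (192.168.66.4) at 52:54:0:cd:7c:3d on bridge100 ifscope [bridge]
ARP_LINE = re.compile(r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-fA-F:]+)")


def normalize_mac(mac: str) -> str:
    """Zero-pad and lowercase each octet so '52:54:0:cd' equals '52:54:00:cd'."""
    return ":".join(part.zfill(2).lower() for part in mac.split(":"))


def ip_from_arp(arp_output: str, mac: str):
    """Return the IP recorded for `mac`, or None if the ARP table has no entry."""
    wanted = normalize_mac(mac)
    for line in arp_output.splitlines():
        match = ARP_LINE.search(line)
        # Entries shown as "(incomplete)" do not match the MAC pattern and are skipped.
        if match and normalize_mac(match.group("mac")) == wanted:
            return match.group("ip")
    return None
```

A `None` result here corresponds to the failure in step 4: the entry has been pruned and nothing has repopulated it yet.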

Expected behavior
The instance to get an IP.

Logs

[2024-12-06T18:28:03.665] [debug] [test] process working dir ''
[2024-12-06T18:28:03.665] [info] [test] process program 'qemu-system-aarch64'
[2024-12-06T18:28:03.665] [info] [test] process arguments '-machine, virt,gic-version=3, -accel, hvf, -drive, file=/Library/Application Support/com.canonical.multipass/bin/../Resources/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on, -cpu, host, -nic, vmnet-shared,model=virtio-net-pci,mac=52:54:00:cd:7c:3d, -device, virtio-scsi-pci,id=scsi0, -drive, file=/var/root/Library/Application Support/multipassd/qemu/vault/instances/test/noble-server-cloudimg-arm64-disk1.img,if=none,format=qcow2,discard=unmap,id=hda, -device, scsi-hd,drive=hda,bus=scsi0.0, -smp, 1, -m, 1024M, -qmp, stdio, -chardev, null,id=char0, -serial, chardev:char0, -nographic, -cdrom, /var/root/Library/Application Support/multipassd/qemu/vault/instances/test/cloud-init-config.iso'
[2024-12-06T18:28:03.670] [debug] [qemu-system-aarch64] [20795] started: qemu-system-aarch64 -machine virt,gic-version=3 -nographic -dump-vmstate /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/multipassd.cchTdi
[2024-12-06T18:28:03.748] [info] [test] process state changed to Starting
[2024-12-06T18:28:03.750] [info] [test] process state changed to Running
[2024-12-06T18:28:03.750] [debug] [qemu-system-aarch64] [20796] started: qemu-system-aarch64 -machine virt,gic-version=3 -accel hvf -drive file=/Library/Application Support/com.canonical.multipass/bin/../Resources/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -cpu host -nic vmnet-shared,model=virtio-net-pci,mac=52:54:00:cd:7c:3d -device virtio-scsi-pci,id=scsi0 -drive file=/var/root/Library/Application Support/multipassd/qemu/vault/instances/test/noble-server-cloudimg-arm64-disk1.img,if=none,format=qcow2,discard=unmap,id=hda -device scsi-hd,drive=hda,bus=scsi0.0 -smp 1 -m 1024M -qmp stdio -chardev null,id=char0 -serial chardev:char0 -nographic -cdrom /var/root/Library/Application Support/multipassd/qemu/vault/instances/test/cloud-init-config.iso
[2024-12-06T18:28:03.750] [info] [test] process started

Additional info

  • OS: macOS 14.05 Sonoma on a MacBook Air M1

  • multipass   1.14.1+mac
    multipassd  1.14.1+mac
    
  • Name:           test
    State:          Running
    Snapshots:      0
    IPv4:           192.168.66.4
    Release:        Ubuntu 24.04.1 LTS
    Image hash:     64312e253c11 (Ubuntu 24.04 LTS)
    CPU(s):         1
    Load:           0.07 0.02 0.00
    Disk usage:     1.1GiB out of 4.8GiB
    Memory usage:   137.3MiB out of 953.1MiB
    Mounts:         --
    
  • qemu

Additional context

Saviq added the bug and needs triage labels Dec 6, 2024
@ricab
Collaborator

ricab commented Dec 6, 2024

I have looked and it looks like Apple has now fixed the source of #3661 (at least on my machine).

As discussed with @Saviq, the leases file was not super reliable either, but we should be able to use one method and fall back to the other.
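To make the "one method, fall back to the other" idea concrete, a rough sketch follows. `first_ip` and the `Lookup` alias are hypothetical names for illustration; the actual lookups (ARP table, leases file) would be passed in as callables, and treating `OSError` as "no answer" is an assumption about how failures should be handled.

```python
from typing import Callable, Iterable, Optional

# A lookup takes no arguments and returns an IP string, or None if it found nothing.
Lookup = Callable[[], Optional[str]]


def first_ip(lookups: Iterable[Lookup]) -> Optional[str]:
    """Try each lookup in order and return the first IP that one of them yields."""
    for lookup in lookups:
        try:
            ip = lookup()
        except OSError:
            # For example an unreadable leases file: treat it as "no answer"
            # and move on to the next method.
            ip = None
        if ip:
            return ip
    return None
```

The order of the list decides which source wins when both have an answer, so the more trusted method goes first.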

ricab added the medium label and removed the needs triage label Dec 6, 2024
@ricab
Collaborator

ricab commented Dec 6, 2024

Calling it a medium since it will show up only occasionally and you have a workaround.

@ricab
Collaborator

ricab commented Dec 6, 2024

We should probably add this to the new troubleshooting in the meantime.

@Saviq
Collaborator Author

Saviq commented Dec 6, 2024

Looks fine on Sequoia here:

$ sudo cat /var/db/dhcpd_leases
{
        name=test
        ip_address=192.168.66.2
        hw_address=ff,f1:f5:dd:7f:0:2:0:0:ab:11:c1:f2:8d:27:c4:a6:73:dc
        identifier=ff,f1:f5:dd:7f:0:2:0:0:ab:11:c1:f2:8d:27:c4:a6:73:dc
        lease=0x675366f3
}
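For reference, a small sketch of reading that file format into records. It assumes every lease is a `{ ... }` block of `key=value` lines like the one above; `parse_leases` is a hypothetical helper, not the parser Multipass uses.

```python
def parse_leases(text: str) -> list[dict[str, str]]:
    """Parse `{ key=value ... }` blocks into a list of dicts, kept in file order."""
    leases = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "{":
            current = {}
        elif line == "}":
            if current is not None:
                leases.append(current)
            current = None
        elif current is not None and "=" in line:
            # Split on the first '=' only, so values containing '=' survive intact.
            key, _, value = line.partition("=")
            current[key] = value
    return leases
```

With the entry above, `parse_leases(...)[0]["ip_address"]` gives `192.168.66.2`, while `hw_address` holds the `ff,...` identifier rather than the instance's MAC.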

@ricab
Collaborator

ricab commented Dec 9, 2024

That's still the new format, which replaced the MAC address with that id. It used to have the MAC in both hw_address and identifier; now it has that sort of hash. I have just the MAC here, but apparently different people have different things.

We could key by the name, but that has caused issues in the past. Different VMs can have the same name (at different times), and I don't think the order of that file is reliable. I guess we could try each one until we succeed.
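A sketch of what "try each one until we succeed" could look like, assuming leases are already parsed into dicts with `name` and `ip_address` keys. `ip_by_name` is a hypothetical helper, and `responds` stands in for whatever check would count as success (an SSH attempt, say); neither is existing Multipass code.

```python
from typing import Callable, Iterable, Mapping, Optional


def ip_by_name(
    leases: Iterable[Mapping[str, str]],
    name: str,
    responds: Callable[[str], bool],
) -> Optional[str]:
    """Return the first lease IP recorded under `name` that `responds` accepts."""
    seen = set()
    for lease in leases:
        ip = lease.get("ip_address")
        if lease.get("name") != name or not ip or ip in seen:
            continue
        seen.add(ip)  # stale duplicates of the same IP are only probed once
        if responds(ip):
            return ip
    return None
```

Since it probes every candidate rather than trusting the first or last match, it does not rely on the file's ordering, at the cost of one probe per stale entry.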

@Saviq
Collaborator Author

Saviq commented Dec 9, 2024

> That's still the new format, which replaced the mac address with that id.

Indeed. What a mess.
