google-guest-agent.service go to dead (inactive) when the VM is built with packer (image) and created with MIGs. #134
Comments
FWIW there's some further discussion/diagnosis of the underlying cause of this issue tracked in https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1938299. Seems to be a cloud-init bug.
Thanks for the detailed report, @lborguetti
Thanks for the report @lborguetti, I'm having a similar issue but in my case, the load balancer routes are not created and therefore I cannot send traffic to the instances.
Thanks for the update @hopkiw. I think google-guest-agent is a critical service and maybe it should have a fallback so that it does not depend on OS boot. In the future, other dependency failures may cause the same behavior.
Any updates on this?
Environment
OS: Ubuntu 20.04 LTS
Kernel: 5.11.0-1020-gcp #22~20.04.1-Ubuntu SMP Tue Sep 21 10:54:26 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
SystemD version: systemd 245 (245.4-4ubuntu3.13)
Google Guest Agent version: 20210629.00-0ubuntu1~20.04.0
Problem
We use Packer to create images and launch them with MIGs using templates. After Oct 04, 2021 we noticed that images built by Packer and deployed with MIGs do not start the google-guest-agent service, which prevents the use of OS Login to connect to the virtual machines. With an image created on Sep 17, 2021, this behavior does not occur.
The behavior only occurs on the first startup by a MIG. If the affected MIG virtual machine is manually restarted (shutdown -r now), the google-guest-agent service is activated on the next boot and it becomes possible to connect to the virtual machine using OS Login.
Debugging details: trying to find the root cause
The image provisioning process by Packer uses Ansible and follows this order:
The unit google-guest-agent.service goes to the dead (inactive) state after the first reboot by the Packer/Ansible build process and after the first boot by the MIG.
Logs before the virtual machine created by MIG is restarted
systemctl status google-guest-agent.service
systemd-analyze verify google-guest-agent.service
systemd-analyze critical-chain google-guest-agent.service
google-guest-agent.service logs while the packer/ansible build process is running
Note: after the VM is created by the MIG there are no more log entries from google-guest-agent.service until the service or VM is manually restarted.
systemd-analyze plot with the inactive (dead) state: systemd-analyze-plot-boot-problem.svg.gz
Logs after the virtual machine created by the MIG is manually restarted (shutdown -r now).
systemctl status google-guest-agent.service
systemd-analyze verify google-guest-agent.service
systemd-analyze critical-chain google-guest-agent.service
systemd-analyze plot with the active (running) state: systemd-analyze-plot-boot-ok.svg.gz
google-guest-agent.service dependency graph
Reproduction steps
ubuntu-os-cloud/ubuntu-minimal-2004-lts
Workaround
/usr/bin/systemctl restart google-guest-agent
I know this isn't the most elegant way to fix the problem.
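A slightly more defensive variant of the restart workaround above is sketched here, in case it is useful to others. It only restarts the unit when systemd reports it as not active, so a healthy agent is left alone. The function name `ensure_agent_running` and the overridable `SYSTEMCTL` and `UNIT` variables are hypothetical names introduced for this sketch; where the function gets called from (for example a startup script) is also an assumption and is not something confirmed in this report.

```shell
#!/bin/sh
# Sketch only: guard around the restart workaround.
# SYSTEMCTL and UNIT are hypothetical knobs added for illustration.
SYSTEMCTL="${SYSTEMCTL:-/usr/bin/systemctl}"
UNIT="${UNIT:-google-guest-agent.service}"

ensure_agent_running() {
    # `systemctl is-active --quiet UNIT` exits 0 only when the unit is active.
    if "$SYSTEMCTL" is-active --quiet "$UNIT"; then
        echo "$UNIT already active"
        return 0
    fi
    # Unit is inactive (dead), failed, or otherwise not active: restart it.
    echo "$UNIT not active; restarting"
    "$SYSTEMCTL" restart "$UNIT"
}

# To use it, call the function once at boot:
#   ensure_agent_running
```

This does not address the underlying cause; it only avoids an unnecessary restart when the service came up normally.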
I had the same behavior using version 20210414.00-0ubuntu1~20.04.0 of google-guest-agent. I believe it is not an agent-related issue, but I don't know enough about this project to continue debugging the problem by myself.
Please let me know if there is any additional information I can provide that will be helpful.
Thanks,