Node (slot) 3 won't boot Talos OS on cm4 #13
Comments
Does the problematic Pi work in the other slots (1, 2, 4) or on the official carrier board? |
@wenyi0421 yes, perfectly, and the other 2 daughterboards with cm4 show the same symptoms: all work in all slots except slot 3, and none of them work in slot 3. But since this looks like a u-boot issue, maybe I should open an issue with them. I've written a howto here: https://github.com/bhuism/talos-tpi2 if you want to reproduce. |
Please test with the official Raspberry Pi image, and consider trying a replacement CM4 Adapter V1.0; this may be a hardware problem. |
I've replaced the adapter+cm4 already. I have 4 adapters and 3 cm4 modules; none of them work (with Talos using u-boot) in slot 3, all of them work in all other slots. I'll try Raspberry Pi OS. |
I can confirm this same issue, with the same error state (syndrome register value 0xbf000002), on any of my CM4 modules (8GB, Wi-Fi, eMMC) connected to the node3 slot that boot using U-Boot. In my case it is the U-Boot version packaged in the Fedora IoT 37 image. |
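For reference, the reported value can be split into the standard AArch64 ESR_ELx fields. This is a minimal sketch, assuming the number is an ESR_ELx syndrome as the kernel prints it; the function name `decode_esr` and the lookup table are illustrative only, and the table lists just a few exception classes.

```python
# Split an AArch64 ESR_ELx value into its architectural fields.
# Layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].

# Partial table of exception classes (assumption: only a handful listed).
EC_NAMES = {
    0x00: "Unknown reason",
    0x20: "Instruction Abort, lower EL",
    0x21: "Instruction Abort, same EL",
    0x24: "Data Abort, lower EL",
    0x25: "Data Abort, same EL",
    0x2F: "SError interrupt",
}


def decode_esr(esr: int) -> dict:
    """Return the EC, IL and ISS fields of a 32-bit syndrome value."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this table"),
        "il": il,
        "iss": iss,
        # ISS bit 24 is the IDS flag; it is only meaningful for SError.
        "ids": (iss >> 24) & 0x1,
    }


if __name__ == "__main__":
    fields = decode_esr(0xBF000002)
    print({k: (hex(v) if isinstance(v, int) else v) for k, v in fields.items()})
```

For the value in this thread, the exception class is 0x2F (SError interrupt) and the IDS bit is set, so the remaining syndrome bits are implementation defined rather than an architectural fault code.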
Raspberry Pi OS Lite boots just fine in the node3 slot. I've swapped cm4s and daughterboards now to get u-boot to work. |
Maybe because node3 is connected to SATA, U-Boot tries to boot from SATA and fails. This may require modifying the U-Boot boot order in the flashed OS. |
I'm afraid so |
I gave this an attempt by building a custom u-boot image with modified boot-order and patched the pre-built talos image with it. Maybe grub also probes SATA and fails? |
Interesting at least, @Daedaluz, thanks for looking into it |
Let me know if there is something to test. Happy to tinker around as well |
I poked around a little bit again, and it looks like grub actually loads the kernel / initramfs properly, but the kernel itself makes it reset. I can manually load the kernel and the initramfs without issues; the reset happens after the boot command. I kind of expected something from the kernel, like an oops or a panic or whatnot, but the only thing I get after the boot command is:
Ideas? |
Thanks @Daedaluz, this is too hardcore kernel for me. Btw, did you know that slot/node 2 boots fine with a SATA mini PCIe card inserted, with the same chipset as on the tpiv2 board, namely the ASMedia ASM1061? Maybe this helps? |
Different for me, doesn't boot here. |
Not much. Is there something similar in a non-u-boot system we can compare to? Maybe report upstream at talos or u-boot or both as well. |
Obviously, when rk1 comes out, we definitely need talos to boot on it |
obviously! |
Working on this a little bit; following https://github.com/u-boot/u-boot/blob/master/doc/develop/crash_dumps.rst I get:
|
I hit the same issue while working on an Alpine-based image powered by u-boot. In my opinion the problem is related to u-boot and the SATA controller. Node3 is not working because it has the native SATA controller connected. Both Node1 and Node2 are working. When an mPCIe SATA controller is connected, those nodes behave the same as Node3. |
@SheGe My experience was not the same with talos: I booted talos fine with a SATA controller in the mPCIe slot in node2 (the same chip as on the tpiv2 board for node 3), go figure |
This issue is also reported at Sidero here: siderolabs/talos#7358 |
I made a custom rpi image with a logging/trace-enabled u-boot; see the attached log |
new log and map: |
I've got talos booted on node3 with a workaround, a custom u-boot.bin (and thus talos image) was needed, it's a hack though. |
@bhuism I'm facing this same issue with eMMC CM4s, but also in slots 1 & 2 (as I have mini PCIe SATA cards installed there). How did you get UART working there? Did it work out of the box? Or did you tinker with In my case (RPI debug probe @ |
@maxromanovsky uarts work out of the box, come to the discord chat, and search for serial debug, you can easily get serial to any node from the bmc command line |
@bhuism thanks!
# tpi --uart=get -n 1
{
"response": [{
"uart": ""
}]
}
# |
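The empty `uart` field in that response can also be checked programmatically. This is a sketch, assuming the response always has the shape shown in the comment (a `response` list of objects with a `uart` string). The helper name `uart_output` is hypothetical, and the code does not call `tpi` itself; it only parses text that was already captured.

```python
import json


def uart_output(raw: str) -> str:
    """Concatenate the 'uart' strings found in a captured JSON response."""
    doc = json.loads(raw)
    # Missing keys are treated as empty so a short reply does not raise.
    return "".join(item.get("uart", "") for item in doc.get("response", []))


if __name__ == "__main__":
    captured = '{"response": [{"uart": ""}]}'  # shape copied from the thread
    text = uart_output(captured)
    print("no buffered serial output" if not text else text)
```

An empty result only says that nothing was buffered at the time of the call; it does not by itself show that the serial link is broken.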
@maxromanovsky the serial ports of the cm4's are all connected to serial devices on the bmc, all 4. I've written something up here: https://github.com/bhuism/talos-tpi2#hardwired-bmc-serial-port-connections-to-nodes, you can use microcom or picocom on the bmc. |
Since you're already set up to build custom If it works for you, you may provide (at your option) an |
@CFSworks will do asap |
@CFSworks it does get past u-boot now and into grub, but gets stuck in booting the kernel: Booting `A - Talos v1.4.7-dirty'
After this it boot loops. This log is not clean, btw: I use picocom from the bmc (I ssh into the bmc) and the lines that come back often look garbled. I tried both the u-boot development branch and the exact u-boot version (2023.1) talos 1.4.7 is using, incl. their patches, with the same result; I was using development in my patch. (This talos image with your patch boots fine on a normal rpi4b, btw.) |
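One rough way to confirm a boot loop from a captured serial log is to count how often the GRUB "Booting" line recurs. This is a sketch, assuming the log was saved as text and each boot attempt prints a line of the form shown above; the function name `count_boot_attempts` and the sample log are made up for illustration.

```python
import re

# Matches GRUB's "Booting `<entry>'" message anywhere in a line, so it
# still works when the serial capture has stray characters around it.
BOOT_LINE = re.compile(r"Booting `([^']*)'")


def count_boot_attempts(log: str) -> dict:
    """Map each GRUB entry title to the number of times it was booted."""
    counts = {}
    for match in BOOT_LINE.finditer(log):
        entry = match.group(1)
        counts[entry] = counts.get(entry, 0) + 1
    return counts


if __name__ == "__main__":
    sample = (
        "Booting `A - Talos v1.4.7-dirty'\n"
        "(reset)\n"
        "Booting `A - Talos v1.4.7-dirty'\n"
    )
    print(count_boot_attempts(sample))
```

A count above one for the same entry within a single capture suggests the node reset and came back through GRUB, which matches the behaviour described here.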
The pasted output is all (presumably) normal output from U-Boot. Could you log into the BMC and have this running: |
I just reproduced this boot loop on my own hardware. I'll spend some time today seeing if this new problem is a shortcoming in my U-Boot patch or a problem in a different component of Talos. |
Editing the GRUB boot entry with
...which is this panic tracked upstream in the Linux kernel bug database. This might be exacerbated by the timing of U-Boot using the PCIe RC for a while and then shutting it down later in the boot when EFI boot services are exited, but is not itself the fault of U-Boot, so I'm going to send that patch upstream now. |
serial log: