-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bus path matcher incorrectly assumes file path suffix, causing Talos to fail to match NVMe drives #114
Comments
This is a legacy interface which we hope to get rid of soon, so probably we won't be fixing it right now, but rather move towards new implementation. |
I can fix this with a PR with the following patch if it would be accepted: if strings.HasPrefix(dev, "nvme") {
// Example: when dev = "nvme0n1", busPathSuffix = "nvme/nvme0/nvme0n1"
busPathSuffix := filepath.Join("nvme", dev[:strings.LastIndex(dev, "n")], dev)
busPath = strings.TrimSuffix(busPath, filepath.Join("nvme", dev))
} else {
busPath = strings.TrimSuffix(busPath, filepath.Join("block", dev))
} Along with an appropriate test (or tests) |
That might be a breaking change for those who rely on old, broken behavior. We have disk matchers done right now, I would rather use that. |
Sorry to clarify do you mean that v2 disk matchers are done? This is the code that feeds the available disk information into the bus path disk matcher (in v1) |
Right - this is exactly what has the problem, and what the aforementioned patch would fix. Talos is using the v1 version of this module for the disk selector code path specifically, see here and here. Talos calls |
To be clear, this bug affects the latest Talos release (v1.8.2), not just some old unsupported version. |
Yes, so the plan is to deprecate |
Is this a "we're going to replace it next week" plan, or a "someday when there's nothing better to do or higher priority" plan? I'm not the only person encountering this. I ran into two people last night in a community forum that were impacted by this. I'm not sure if compatibility or concerns about breakage are reasonable here. The logic is objectively incorrect, so fixing it should at worst only impact people who both discovered and fixed the issue on their own. Additionally, Talos v1.8 introduced several breaking changes (like bumping containerd to v2), so there is recent precedence for breaking existing setups when upgrading Talos. If nothing else, can the docs be updated to mention this bug? IMO Talos administrators shouldn't need to dig through the Talos codebase figure out what's going on and how to mitigate the problem. |
Please feel free to open a PR to fix the docs, but backporting this to as a patch release in 1.8.x sounds too much risk to me at the moment. |
When getting information about available disks, this line trims
block/<device name>
from the bus path. For example, it changes/pci0000:00/0000:00:03.2/0000:06:00.0/host11/port-11:0/expander-11:0/port-11:0:0/end_device-11:0:0/target11:0:0/11:0:0:0/block/sda
into/pci0000:00/0000:00:03.2/0000:06:00.0/host11/port-11:0/expander-11:0/port-11:0:0/end_device-11:0:0/target11:0:0/11:0:0:0/
.The problem is that not all block devices have a
block/<device name>
path. Specifically, NVMe drives have a suffix ofnvme<dev number>/nvme<dev number>n<namespace number>
. For example, for a device namednvme0n1
, the suffix would benvme0/nvme0n1
.This can be confirmed with Talos, which is affected by the aforementioned issue. The command
talosctl get disks
handles this correctly for some reason. Here's an example output for one of my NVMe drives:However, if I specify
busPath: /pci0000:00/0000:00:1d.0/0000:58:00.0/nvme
in my node configuration,talosctl apply-config
fails withno disk found matching provided parameters
. Setting it tobusPath: /pci0000:00/0000:00:1d.0/0000:58:00.0/nvme/nvme1/nvme1n1
works as expected.It should also be trivial to check the structure of this file tree by running
ls -l /sys/block/<some nvme dev>
on any Linux system with a NVMe drive installed.The text was updated successfully, but these errors were encountered: