Buggy or uncommon ACPI tables break xenopsd-xc startup (and thus XAPI's startup) #6240

Open
stormi opened this issue Jan 20, 2025 · 13 comments · May be fixed by #6249

Comments

@stormi
Contributor

stormi commented Jan 20, 2025

A user of XCP-ng hit a regression in XCP-ng 8.3 compared to XCP-ng 8.2: XAPI doesn't start completely. The last step started is [starting up database engine], which fails with Unix.ECONNREFUSED for put_import_metadata.

Related or not, xenopsd also fails:

Jan 17 15:35:28 v-server-1 xenopsd-xc: [debug||0 ||topology] Distances: [|[|10; 4294967295|]; [|4294967295; 4294967295|]|]
Jan 17 15:35:28 v-server-1 xenopsd-xc: [debug||0 ||topology] CPU2Node: [|0; 0; 0; 0; 0; 0; 1; 1; 1; 1; 1; 1|]
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] xenopsd exitted with an uncaught exception: Invalid_argument("NUMA distance from node to itself must be 10: 4294967295")
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 0. Raised at Stdlib.invalid_arg in file "stdlib.ml", line 30, characters 20-45
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 1. Called from Topology.NUMA.make.(fun) in file "ocaml/xenopsd/lib/topology.ml", line 196, characters 10-103
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 2. Called from Stdlib__Array.iteri in file "array.ml", line 129, characters 31-51
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 3. Called from Topology.NUMA.make in file "ocaml/xenopsd/lib/topology.ml", line 192, characters 4-400
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 4. Called from CamlinternalLazy.force_lazy_block in file "camlinternalLazy.ml", line 31, characters 17-27
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 5. Re-raised at CamlinternalLazy.force_lazy_block in file "camlinternalLazy.ml", line 36, characters 4-11
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 6. Called from Domain.numa_init in file "ocaml/xenopsd/xc/domain.ml", line 851, characters 13-38
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 7. Called from Xenops_server_xen.init in file "ocaml/xenopsd/xc/xenops_server_xen.ml", line 5183, characters 2-21
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 8. Called from Xenopsd.main in file "ocaml/xenopsd/lib/xenopsd.ml", line 460, characters 2-42
Jan 17 15:35:28 v-server-1 xenopsd-xc: [error||0 ||xenopsd] 9. Called from Dune__exe__Xenops_xc_main in file "ocaml/xenopsd/xc/xenops_xc_main.ml", line 60, characters 2-66

It apparently gets buggy distances from the ACPI tables, raises Invalid_argument("NUMA distance from node to itself must be 10: 4294967295"), and exits.

Is xenopsd too strict here? What would have to be done to solve it?

Reference forum thread: https://xcp-ng.org/forum/topic/10244/8-3-network-troubles/12

@edwintorok
Contributor

edwintorok commented Jan 20, 2025

Thanks. This check was enabled in 6074aef (2023), by default once no failures were found in the lab, to find buggy machines or buggy code ahead of enabling NUMA support.

The code could be changed to disable NUMA support (numa-affinity-policy) when a buggy table is found: if one value is invalid, then we can't trust the rest of the table either.

For now, reverting the commit should make it work again for the affected user.
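
A minimal sketch of that idea, assuming a simplified record type and a hypothetical `make_opt` function (neither is the actual `Topology.NUMA.make` from xen-api): the constructor returns `None` when the diagonal of the distance matrix is not 10, so the caller can disable NUMA placement instead of letting an exception terminate xenopsd.

```ocaml
(* Hypothetical, simplified stand-in for the real topology type. *)
type numa = { distances : int array array; cpu2node : int array }

(* ACPI SLIT: the distance from a node to itself is 10. *)
let local_distance = 10

(* Return None for a malformed matrix instead of raising Invalid_argument. *)
let make_opt ~distances ~cpu2node =
  let n = Array.length distances in
  let square = Array.for_all (fun row -> Array.length row = n) distances in
  let diagonal_ok () =
    let ok = ref true in
    Array.iteri
      (fun i row -> if row.(i) <> local_distance then ok := false)
      distances ;
    !ok
  in
  (* diagonal_ok only runs once the matrix is known to be square *)
  if n > 0 && square && diagonal_ok () then Some { distances; cpu2node }
  else None

let () =
  (* Matrix taken from the log above; needs a 64-bit OCaml for the literal. *)
  let distances = [| [| 10; 4294967295 |]; [| 4294967295; 4294967295 |] |] in
  let cpu2node = [| 0; 0; 0; 0; 0; 0; 1; 1; 1; 1; 1; 1 |] in
  match make_opt ~distances ~cpu2node with
  | Some _ -> print_endline "NUMA information accepted"
  | None -> print_endline "invalid distance matrix, NUMA placement disabled"
```

With the matrix from the reported log, this prints the second message and startup would continue without NUMA information.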

@edwintorok
Contributor

However, reverting the commit might cause xenopsd to call Lazy.force from a thread, which would require the backported OCaml runtime commit ocaml/ocaml@ed4695a to prevent Lazy.force from causing a segfault (having that patch would be a good idea anyway).

@psafont
Member

psafont commented Jan 20, 2025

NUMAPlacement.make should return an option, IMO.

Tentative commit: 5f49ba1

@edwintorok
Contributor

By the way, here is the ACPI spec: https://uefi.org/specs/ACPI/6.5_A/06_Device_Configuration.html#system-locality-information-table. It says that entry (i, i) always has the value 10.

@edwintorok
Contributor

edwintorok commented Jan 20, 2025

@stormi it would be useful to see the full xl-dmesg.out and an acpidump of the relevant SRAT/SLIT tables.
See the __node_distance function in Xen: 0xff (-1) could mean an unreachable node, or a table value in the range 0-9.

If that is indeed an unreachable NUMA node (e.g. one that has no online CPUs) then we can safely ignore it and still use the rest of the distance matrix.
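
As an illustration of how the logged value relates to -1: 4294967295 is 2^32 - 1, which is -1 when the low 32 bits are read as a signed integer. The sketch below uses a hypothetical `classify` helper and `distance` type (not part of xenopsd) and assumes a 64-bit OCaml `int`. It treats -1 and 0xff as unreachable and 0-9 as invalid, following the ACPI convention that the local distance is 10.

```ocaml
(* Hypothetical classification of one raw distance-matrix entry. *)
type distance =
  | Unreachable  (* sentinel: -1 as a 32-bit value, or 0xff *)
  | Distance of int  (* usable relative distance, 10 = local *)
  | Invalid of int  (* anything else, e.g. 0-9 *)

let classify raw =
  (* Int32.of_int keeps the low 32 bits, so 4294967295 becomes -1l. *)
  let signed = Int32.to_int (Int32.of_int raw) in
  if signed = -1 || signed = 0xff then Unreachable
  else if signed >= 10 && signed < 0xff then Distance signed
  else Invalid signed

let () =
  let show = function
    | Unreachable -> "unreachable"
    | Distance d -> Printf.sprintf "distance %d" d
    | Invalid v -> Printf.sprintf "invalid %d" v
  in
  List.iter
    (fun raw -> Printf.printf "%d -> %s\n" raw (show (classify raw)))
    [ 10; 4294967295; 5 ]
```

Classifying entries this way would let the caller decide whether a bad entry means "skip this node" or "distrust the whole table".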

psafont added a commit to psafont/xen-api that referenced this issue Jan 21, 2025
Instead disable NUMA for the host

Fixes xapi-project#6240

Signed-off-by: Pau Ruiz Safont <[email protected]>
@stormi
Contributor Author

stormi commented Jan 21, 2025

The tentative commit from yesterday fixed the issue for the user.

Jan 21 11:01:49 v-server-1 xenopsd-xc: [debug||0 ||topology] Distances: [|[|10; 4294967295|]; [|4294967295; 4294967295|]|]
Jan 21 11:01:49 v-server-1 xenopsd-xc: [debug||0 ||topology] CPU2Node: [|0; 0; 0; 0; 0; 0; 1; 1; 1; 1; 1; 1|]
Jan 21 11:01:49 v-server-1 xenopsd-xc: [debug||0 ||xenops] Host NUMA information: Some\x0A  { "distances" = [|[|10; 4294967295|]; [|4294967295; 4294967295|]|];\x0A    "cpu2node" = [|0; 0; 0; 0; 0; 0; 1; 1; 1; 1; 1; 1|];\x0A    "node_cpus" = [|[0; 1; 2; 3; 4; 5]; [6; 7; 8; 9; 10; 11]|] }
Jan 21 11:01:49 v-server-1 xenopsd-xc: [debug||0 ||xenops] NUMA node 0: 30836379648/36507222016 memory free
Jan 21 11:01:49 v-server-1 xenopsd-xc: [debug||0 ||xenops] NUMA node 1: 0/0 memory free
Jan 21 11:01:49 v-server-1 xenopsd-xc: [debug||0 ||xenops] xenstore is responding to requests

I've got a xen-bugtool from the user, so I'll try to find the data you asked for.

@stormi
Contributor Author

stormi commented Jan 21, 2025

@edwintorok

acpidump.out.txt
xl-dmesg.out.txt

@psafont do you need tests from the user with your latest patch?

@edwintorok
Contributor

(XEN) [0000002112563a15] SRAT: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
(XEN) [0000002112564b11] SRAT: Node 0 PXM 0 [0000000100000000, 000000087fffffff]
(XEN) [00000021125670a5] SRAT: node 1 has no memory

OK, so maybe not a buggy ACPI table after all. If node 1 has no memory then we should be ignoring it. Probably quite rare to have such a system (if it has no memory, then what is its purpose?)

@andyhhp
Collaborator

andyhhp commented Jan 21, 2025

A node is a node whether it has memory or not.

(Mis)configurations like this were very common in the AMD MagnyCours days, and one OEM shipped a lot of systems with DIMMs populated in a less-optimal configuration.

@edwintorok
Contributor

FWIW here is the decoded SLIT and SRAT table using xxd -r and iasl -d

slit.dsl.txt
srat.dsl.txt

@edwintorok
Contributor

ah can we also have NUMA nodes with no CPUs in it?
We should probably filter the matrix we receive to:

  • eliminate unreachable nodes (i.e. those with distance -1 in (i,i)). We also need to fix how we interpret the entry, because there seems to be a signed/unsigned int mismatch there (I think we expect the value to be unsigned; maybe that is wrong)
  • eliminate nodes without CPUs, or nodes without any online CPUs
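
The two filtering steps could look roughly like the sketch below. The names `usable_nodes` and `filter_matrix` are hypothetical, the sentinel is assumed to arrive as 4294967295 (as in the log above, on a 64-bit OCaml), and `cpu2node` is assumed to list only online CPUs, which the real code would have to establish separately.

```ocaml
(* Assumed sentinel for "no distance", as seen in the log. *)
let unreachable = 4294967295

(* Nodes whose self-distance is valid and that own at least one CPU. *)
let usable_nodes ~distances ~cpu2node =
  let n = Array.length distances in
  let has_cpu node = Array.exists (fun nd -> nd = node) cpu2node in
  List.filter
    (fun i -> distances.(i).(i) <> unreachable && has_cpu i)
    (List.init n Fun.id)

(* Keep only the rows and columns of the selected nodes. *)
let filter_matrix ~distances nodes =
  let nodes = Array.of_list nodes in
  Array.map (fun i -> Array.map (fun j -> distances.(i).(j)) nodes) nodes

let () =
  let distances = [| [| 10; 4294967295 |]; [| 4294967295; 4294967295 |] |] in
  let cpu2node = [| 0; 0; 0; 0; 0; 0; 1; 1; 1; 1; 1; 1 |] in
  let nodes = usable_nodes ~distances ~cpu2node in
  let reduced = filter_matrix ~distances nodes in
  Printf.printf "kept %d of %d nodes\n" (Array.length reduced)
    (Array.length distances)
```

For the reported host only node 0 survives. Note that the reduced matrix is indexed by position in the kept list, so `cpu2node` would need a matching remap (and CPUs on dropped nodes a policy) before it could be used for placement.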

stormi changed the title from "Buggy ACPI tables break xenopsd-xc startup (and thus XAPI's startup?)" to "Buggy or uncommon ACPI tables break xenopsd-xc startup (and thus XAPI's startup)" Jan 21, 2025
@andyhhp
Collaborator

andyhhp commented Jan 21, 2025

ah can we also have NUMA nodes with no CPUs in it?

Yes. If you e.g. downcore to a single CPU, and the package has multiple memory controllers, then you'll get a NUMA node with IO and RAM, but no (online) CPUs. Also, manually playing with xen-hptool cpu-offline can create a configuration with no online CPUs on a node.

Furthermore, in principle, a memory controller more hops than usual away from the cores could manifest like this, but I'm not aware of a platform which looks like this naturally.

What's weird here is that the SLIT declares a single locality, while the SRAT lists two proximity domains. I expect Xen has filled in the blanks with -1, including part of the leading diagonal of the matrix.
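
To make that hypothesis concrete, here is a small model of "filling in the blanks". It is not Xen's code; `pad_distances` and `invalid_dist` are hypothetical names, and a 64-bit OCaml `int` is assumed. A 1x1 SLIT padded out to the two nodes listed by the SRAT yields the matrix that xenopsd logged.

```ocaml
(* Assumed "no distance" marker: all 32 bits set, i.e. -1 as a 32-bit int. *)
let invalid_dist = 0xFFFF_FFFF

(* Extend a square SLIT of size [known] to [nodes] x [nodes] entries,
   marking every entry the SLIT does not cover as invalid. *)
let pad_distances ~slit ~nodes =
  let known = Array.length slit in
  Array.init nodes (fun i ->
      Array.init nodes (fun j ->
          if i < known && j < known then slit.(i).(j) else invalid_dist ) )

let () =
  let padded = pad_distances ~slit:[| [| 10 |] |] ~nodes:2 in
  Array.iter
    (fun row ->
      print_endline
        (String.concat "; " (Array.to_list (Array.map string_of_int row))) )
    padded
```

The second row and column, including the (1,1) diagonal entry, come out as 4294967295, which is exactly the entry the self-distance check rejected.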

@psafont
Member

psafont commented Jan 22, 2025

@psafont do you need tests from the user with your latest patch?

If they're up for it, sure. There was a bug in the previous one.

psafont added a commit to psafont/xen-api that referenced this issue Jan 24, 2025
Instead disable NUMA for the host

Fixes xapi-project#6240

Signed-off-by: Pau Ruiz Safont <[email protected]>
psafont added a commit to psafont/xen-api that referenced this issue Jan 24, 2025
…nce matrices

Instead disable NUMA for the host

Fixes xapi-project#6240

Signed-off-by: Pau Ruiz Safont <[email protected]>
psafont added a commit to psafont/xen-api that referenced this issue Jan 27, 2025
…nce matrices

Instead disable NUMA for the host

Fixes xapi-project#6240

Signed-off-by: Pau Ruiz Safont <[email protected]>
psafont added a commit to psafont/xen-api that referenced this issue Jan 30, 2025
…nce matrices

Instead disable NUMA for the host

Fixes xapi-project#6240

Signed-off-by: Pau Ruiz Safont <[email protected]>
4 participants