Recommend disabling UNWIND_PATCH_PAC_INTO_SCS #2

Open
a13xp0p0v opened this issue Jul 6, 2024 · 4 comments

Comments

@a13xp0p0v
Contributor

a13xp0p0v commented Jul 6, 2024

Hello!

In a13xp0p0v/kernel-hardening-checker#105 Daniel Micay @thestinger says that UNWIND_PATCH_PAC_INTO_SCS should be disabled, because it reduces security compared to both PAC and SCS.

Quoting:

PAC is a purely probabilistic security feature which can be bypassed through brute force attacks. PAC normally has 16 bits in the default configuration with 39-bit address space and 4k pages, but it drops to 7 bits with a 48-bit address space. It's even lower in some of the other configurations. SCS is a deterministic security feature, but it lacks a way to protect the shadow stack from arbitrary writes. It's difficult to say which is better, but having both enabled is clearly better for security than only PAC.

Please see more rationale in a13xp0p0v/kernel-hardening-checker#105.
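The bit counts in the quote can be reproduced with a little arithmetic. The sketch below assumes the PAC occupies the pointer bits between the top of the virtual address and bit 55, with the top byte reserved for tags; that layout is an assumption chosen because it matches the 16-bit and 7-bit figures quoted above, and the function names are hypothetical.

```python
# Assumption: bit 55 selects the address range and bits 56-63 are reserved
# for tags, so the PAC gets whatever lies between the VA bits and bit 55.
RANGE_SELECT_BIT = 55


def pac_bits(va_bits: int) -> int:
    """Pointer bits left over for the PAC under the assumed layout."""
    return RANGE_SELECT_BIT - va_bits


def worst_case_guesses(bits: int) -> int:
    """Upper bound on the guesses needed to hit one PAC value by brute force."""
    return 2 ** bits


for va in (39, 48):
    bits = pac_bits(va)
    print(f"{va}-bit VA: {bits} PAC bits, at most {worst_case_guesses(bits)} guesses")
```

Whether an attacker actually gets that many attempts depends on what happens after a failed authentication, which is discussed further down the thread.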

How about changing the UNWIND_PATCH_PAC_INTO_SCS recommendation?

Thanks!

@kees
Contributor

kees commented Jul 6, 2024

It's 16 bits, but they change at every function, so PAC is pretty strong. It is, however, true that keeping both enabled is stronger. And we certainly have more expensive stuff in the recommendations...

Perhaps a comment about saving kernel stack space, and the trade-off when the hardware supports PAC, and then switch the recommendation to disabled for max coverage?

(There is also the question of PAC vs stack protector...)

@thestinger

thestinger commented Jul 6, 2024

Stack allocation MTE is a far better mitigation than SSP. Even marking all the stack allocations with a single non-zero tag set on entry and unset before return would be far better than SSP. SSP is obsolete on devices with proper MTE hardware support, they simply have to start using it. It's way weaker than simply doing that most basic form of MTE stack protection. MTE stack protection is normally done with a per-unsafe-allocation tag but it can be done without that. Either way, it has deterministic protection of the return address, register spills, allocations without address taken, etc. from ones with potential overflow or use-after-scope.

PAC is only 16 bits with 39 bit address space which is inadequate for aggressive address space based mitigations for heap memory corruption in userspace. They should have just added hardware shadow stack support and

PAC doesn't obsolete SSP because it's so weak. 56 bits vs. 7 or 16 bits. You can still brute force 16 bits. It's a bad design which conflicts with other valuable security features including future larger memory tags and existing support for larger address space rather than the barely adequate 39 bits.

@kees
Contributor

kees commented Jul 7, 2024

Just to be clear: I'd like to keep stack-protector even with PAC.

That said, we can't just compare 56 vs 16/7 bits. SSP is 56 bits per thread, and PAC is 16 or 7 bits per caller per thread: PAC is hash(secret + return address + stack pointer). For an arbitrary read primitive, it is either as hard as SSP to recover the key or harder (needing to read the PAC of the specific target function at the exact stack depth for the same thread).

I'm not aware of any oracles that would allow for brute-forcing: if the PAC fails, we're about to raise a bad address exception...
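A toy model of the hash(secret + return address + stack pointer) construction may make the "per caller per thread" point concrete. This is only a sketch: HMAC-SHA256 truncated to 16 bits stands in for the hardware's actual cipher, and `toy_pac`, `authenticate`, and the example addresses are all hypothetical.

```python
import hashlib
import hmac
import os

PAC_BITS = 16  # assumed width, matching the 39-bit address space case


def toy_pac(key: bytes, return_address: int, stack_pointer: int) -> int:
    """Truncated keyed hash over (return address, stack pointer).

    Stand-in for the hardware algorithm, not a description of it.
    """
    msg = return_address.to_bytes(8, "little") + stack_pointer.to_bytes(8, "little")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "little") & ((1 << PAC_BITS) - 1)


def authenticate(key: bytes, return_address: int, stack_pointer: int, pac: int) -> bool:
    """Recompute the PAC and compare; a mismatch models a failed authentication."""
    return toy_pac(key, return_address, stack_pointer) == pac


key = os.urandom(16)            # stands in for the per-thread secret
ret = 0xFFFFFFC008123456        # made-up return address
for sp in (0xFFFFFFC010003F00, 0xFFFFFFC010003E80):
    # Same return address at a different stack depth gets its own PAC.
    print(hex(sp), hex(toy_pac(key, ret, sp)))
```

In this model a PAC leaked for one (return address, stack pointer) pair says nothing directly about any other pair, whereas a leaked SSP canary is valid for every frame of the thread.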

@thestinger

I do think SSP should be removed by enabling stack MTE. It doesn't have to be the full stack allocation MTE tagging each allocation, it can be per-frame MTE instead. It can provide a deterministic defense against allocations overwriting internal data by simply leaving that tagged 0 and tagging the allocations non-zero. Tagging each frame's allocations non-zero and untagging before return is simple and not high overhead. Zeroing and tagging are done together so the initial tagging barely adds a cost compared to just zeroing locals.
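The per-frame scheme described above can be illustrated with a toy simulation. It assumes 16-byte tag granules and models only the tag check on stores; `ToyTaggedStack`, `TagFault`, and the frame layout are hypothetical names and choices for illustration, not how any compiler or kernel implements it.

```python
GRANULE = 16  # assumed tag granule size in bytes


class TagFault(Exception):
    """Raised when a store's pointer tag does not match the memory tag."""


class ToyTaggedStack:
    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)  # everything starts tagged 0

    def tag_range(self, start: int, length: int, tag: int) -> None:
        """Set the tag on every granule overlapping [start, start + length)."""
        first = start // GRANULE
        last = (start + length + GRANULE - 1) // GRANULE
        for granule in range(first, last):
            self.tags[granule] = tag

    def store(self, ptr_tag: int, addr: int, value: int) -> None:
        """Write one byte, faulting if the pointer and memory tags differ."""
        if self.tags[addr // GRANULE] != ptr_tag:
            raise TagFault(f"tag mismatch at offset {addr}")
        self.data[addr] = value


stack = ToyTaggedStack(64)
stack.tag_range(0, 32, 1)       # frame entry: the frame's allocations get tag 1
# Offsets 32..47 model the saved return address and stay tagged 0.
try:
    for offset in range(48):    # linear overflow through a pointer tagged 1
        stack.store(1, offset, 0x41)
except TagFault as fault:
    print("overflow stopped:", fault)
stack.tag_range(0, 32, 0)       # untag before return
```

In the model the overflow faults on the first byte past the tagged allocations, every time, which is the deterministic property being contrasted with a canary check at function return.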
