Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kern: Make stack zap failure less catastrophic #1947

Merged

Conversation

cbiffle
Copy link
Collaborator

@cbiffle cbiffle commented Dec 13, 2024

The kernel really wants to scribble task memory. Like, really really. Like, contains unwrap()s about it.

This is silly. It's a diagnostic tool. The kernel doesn't care if it works or not, and the application should not fail to start because of it.

So, this commit makes it optional and tolerates failure.

Is this me papering over a kernel bug? Yes, #1943 to be precise. But it's reasonable paper.

@cbiffle cbiffle requested a review from hawkw December 13, 2024 02:29
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that a failure to scribble over the stack should probably not take down the whole system, so this looks good to me! It might be nice if there were some comments explaining why we allow this to fail in the source, for future readers? Not a huge deal though.

@@ -309,15 +309,14 @@ pub fn reinitialize(task: &mut task::Task) {
.iter()
.find(|region| region.contains(initial_stack.saturating_sub(4)))
{
let mut uslice: USlice<u32> = USlice::from_raw(
if let Ok(mut uslice) = USlice::<u32>::from_raw(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmmaybe worth a comment here to the effect of "if this fails, don't crash the whole system, as it's just for diagnostic purposes"? also, perhaps we should mention why it might fail here?

for word in zap.iter_mut() {
*word = 0xbaddcafe;
) {
let zap = task.try_write(&mut uslice).unwrap_lite();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, the try_write call failing here would be Bad News --- if a task's stack region isn't actual memory or that memory isn't writable, something has gone seriously wrong, so I think unwrapping this is reasonable. but, that might not be immediately obvious to a reader, who would see that we allow the Uslice::from_raw call to fail, but if we can't write to it, we throw up our hands and kill the whole system, which might look weird at a glance. Maybe a comment describing the difference between the two failure modes would be worth including here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that was approximately my thinking with leaving the unwrap, but you're right that a comment would improve things.

@cbiffle
Copy link
Collaborator Author

cbiffle commented Dec 13, 2024

Comments added, PTAL.

@cbiffle cbiffle requested a review from hawkw December 13, 2024 21:32
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments look good to me, ship it!

@cbiffle cbiffle force-pushed the cbiffle/tolerate-stack-zap-failure branch from c2951ae to bbe264a Compare December 13, 2024 22:10
@cbiffle cbiffle enabled auto-merge (rebase) December 13, 2024 22:13
@cbiffle cbiffle merged commit cf855b7 into oxidecomputer:master Dec 13, 2024
125 checks passed
@cbiffle cbiffle deleted the cbiffle/tolerate-stack-zap-failure branch December 14, 2024 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants