-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Measure the SP on SP_RESET signal interrupt #1946
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 554 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
552 | 2 | 0 | 0 |
Click to see the invalid file list
- lib/endoscope/src/main.rs
- lib/endoscope/src/shared.rs
bb29475
to
64c50cc
Compare
1dcbf79
to
2f49e0c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really nice. Thanks for putting in the work to finally make the RoT a reality!
Either before we merge or right afterwards I think we should makes sure this gets run through a sample manufacturing flow to make sure we don't have any surprises there.
drv/lpc55-swd/src/main.rs
Outdated
// It takes about 0.25 seconds (236 RoT systicks) for `endoscope` to run. | ||
// Allow about twice that time for the measurement to complete. | ||
// endoscope executes a BKPT instruction on completion. | ||
// We observe an S_HALT state if all goes well. | ||
if self.halt_wait(512).is_err() { | ||
ringbuf_entry!(Trace::DidNotHalt); | ||
return Err(()); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we observe if endoscope either loops forever or hits a fault?
drv/lpc55-swd/src/main.rs
Outdated
self.halt_wait(5000) | ||
} | ||
|
||
fn halt_wait(&mut self, timeout: u64) -> Result<(), SpCtrlError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I confused myself because halt_wait
means wait for the system to halt, not attempt to halt and then wait. Can you either add a comment to this effect or change the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait_for_sp_halt
would be better.
value: u32, | ||
) -> Result<(), SpCtrlError> { | ||
// C1.6 Debug system registers | ||
let r = match register { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vibes: should we make the enum Reg
include all of these values? Then, we could make this function take a register: Reg
instead of having to check here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea. I found it difficult to reason about the switch statement.
.map_err(|_| SpCtrlError::Fault)?; | ||
|
||
let cnt = buf.len(); | ||
if cnt % 4 != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: should we do this check before starting a read transaction, to avoid leaving the SP in a weird state?
let gpio = Pins::from(self.gpio); | ||
setup_rot_to_sp_reset_l_in(self.gpio); | ||
gpio.set_val(ROT_TO_SP_RESET_L_IN, Value::One); // should be a no-op | ||
if squelch_notify { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by squelch_notify
here; wouldn't the IRQ have already fired in sp_reset_enter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're re-asserting SP_RESET and don't want to be notified about the reset that we caused. This removes the interrupt that we would have gotten on return from the handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the kernel queue the notification as soon as the pin changes state? I would expect that we need a call to sys_irq_control_clear_pending
here as well.
self.init = true; // we don't want GPIO pin reconfiguration. | ||
if self.do_setup_swd().is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines make me nervous: do_setup_swd
itself sets self.init = true
on success, but here we unconditionally set it. I feel like init: bool
is too ambiguous; we're using it both for "are pins configured" and "are we controlling the SP over SWD". Should it instead be an enum
type?
// Setting up to inject the measurement program into the SP | ||
// has several potential failures. Use this `prep` closure | ||
// and `need_undo` state to keep from indenting too much. | ||
let mut prep = |undo: &mut Undo| -> Result<(), ()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can capture need_undo
automatically and don't need to pass it as an argument (so *undo
in the function body would just become need_undo
)
if self | ||
.dp_write_bitflags::<Demcr>(Demcr::VC_CORERESET) | ||
.is_err() | ||
{ | ||
ringbuf_entry!(Trace::DemcrWriteError); | ||
*undo |= Undo::VC_CORERESET; | ||
return Err(()); | ||
} | ||
*undo |= Undo::VC_CORERESET; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can remove duplicate code by capturing the result, then setting undo
, then checking it.
if self | |
.dp_write_bitflags::<Demcr>(Demcr::VC_CORERESET) | |
.is_err() | |
{ | |
ringbuf_entry!(Trace::DemcrWriteError); | |
*undo |= Undo::VC_CORERESET; | |
return Err(()); | |
} | |
*undo |= Undo::VC_CORERESET; | |
let r = self.dp_write_bitflags::<Demcr>(Demcr::VC_CORERESET); | |
*undo |= Undo::VC_CORERESET; | |
if r.is_err() { | |
ringbuf_entry!(Trace::DemcrWriteError); | |
return Err(()); | |
} |
// Asserting SP_RESET for >1ms here works. | ||
hl::sleep_for(1); | ||
|
||
if self.dp_write_bitflags::<Dhcsr>(Dhcsr::resume()).is_err() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resume()
is a confusing name for the function which sets debug enable
let digest = if prep(&mut need_undo).is_ok() { | ||
self.do_measure_sp() | ||
} else { | ||
Err(()) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let digest = if prep(&mut need_undo).is_ok() { | |
self.do_measure_sp() | |
} else { | |
Err(()) | |
}; | |
let digest = prep().and_then(|()| self.do_measure_sp()); |
(this assumes you take my suggestion above of capturing need_undo
instead of passing it in)
|
||
// This read of DHCSR should never be useful given the code above. | ||
// Read back to make sure that VC_CORERESET is clear. | ||
// XXX remove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it time to remove this?
} | ||
|
||
// The SP is still halted. | ||
// Get it running again by toggling its RESET pin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be nice to leave a comment here explaining that the caller is responsible for clearing the pending IRQ, so we don't measure it again.
Add interrupt-related API calls to the LPC55 `gpio_driver`. A task on an LPC55 can now configue and use GPIO interrupts. app.toml example shows Pin Interrupt configuration: [tasks.foo] ... interrupts = { "pint.irq0" = "button-irq" } ... task-slots = ["gpio_driver", ...] [tasks.foo.config] pins = [ { name="BUTTON', pin={ port=1, pin=9}, alt=0, pint=0, direction="input", opendrain="normal" } ]
swd: - detects JTAG/dongle and SP_RESET - can drive SP_RESET to stop SP and initiate measurement procedure. - on JTAG dongle present or other failure, don't record a bogus FWID, just reset the log. - on SP reset detection, injects a blob to measure the SP. - on powered JTAG dongle detection, resets attestation log. - swd logs the result to attest task. - the hiffy command `db_reset_sp` is a conditional feature for debugging and testing. endoscope: An injectable code blob covered by the RoT signature. - configure STM32H7 clocks in endoscope for best performance (thanks Cliff). - Use instruction/data tightly coupled memory (ITCM/DTCM) for better performance. - Build the injectable SP measurement blob as a cargo `bindeps` artifact. attest: - Add `reset`, and `reset_and_record` to the attest task. - Only tasks listed in `[tasks.attest.config]` are allowed to reset the log. - Unauthorized tasks get Err(ClientError::AccessViolation. build/lpc55pins: - Modify build/lpc55pins to generate separate GPIO pin setup functions for named pins. This allows easier switching between pin configurations at runtime.
Also add safety command in endoscope.
The swd task doesn't need to perform an SP Hubris image sanity check. It measures the whole active image bank no matter what the contents. Remove some unwrap_lite() calls.
- Remove use of unwrap_lite. - Fix write/read-back/retry logic for injecting endoscope. - Add comments documenting STM32H753 vector table.
if self | ||
.do_write_core_register(Reg::Dr.into(), sp_reset_vector | 1) | ||
.is_err() | ||
if let Some(sp_reset_vector) = slice_to_le_u32(&ENDOSCOPE_BYTES[4..=7]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this change? I found the previous version to be more clear:
let sp_reset_vector =
u32::from_le_bytes(ENDOSCOPE_BYTES[4..=7].try_into().unwrap_lite());
versus the new code, which hides the unreachable!
in the else
of a conditional which is always taken
This is a work in progress. Measurement currently takes 12.8s about 0.482 and we'd like to improve that or accommodate that delay before merging.That's in addition to any feedback that people may have.
On SP_RESET asserted, the RoT will measure the entire active SP Hubris bank (Hubris plus 0xff padding from 0x800_0000 to 0x80f_ffff).
The measurement reliably takes about 0.482 seconds since neither the SP nor the RoT are doing anything else during this time. This low measurement time and the existing timeouts and retries used by the control plane mitigate concerns that were present when the measurement time was measured in multiple seconds.
This time could be further reduced. It is currently taking 0.245 seconds to inject the measurement program into the SP. Each u32 is being sent and read back separately and retried if there is any error (no errors have been seen as of yet).
If all is well with the SP, the measurement will match the
Sha3_256
value represented in the SP Hubris archive in the file named "img/final.fwid".Testing SP measurement
Checkout hubris branch
attest-sp
Measure on SP Reset:
Hit the reset button on the SP or use the SpCtrl.db_reset_sp function:
humility --archive=$ARCHIVE hiffy -c SpCtrl.db_reset_sp -a delay_ms=10 sleep 13
See traces
The ringbuf trace in
swd
andattest
tasks have the interesting information including the time expended:humility --archive=$ARCHIVE ringbuf
Dump the attestation log:
humility --archive=$ARCHIVE hiffy -c Attest.log -a offset=0 -n 256 -o out hexdump -C out
The hexdump is not hard to read directly. The first four bytes are a LE u32 giving
the number of valid entries. Each entry is a 1-byte discriminator that is 0_u8 for
Sha3_256
followed by the 32-byte measurement (FWID). There is space for 16 measurements.Notes
swd
task has initialized. If needed, the control plane can request the SP to reset itself which will result in the RoT measuring the SP.attest
taskneedshas areset_log_and_record
function usable only by theswd
task (and hiffy for debug builds).