Measure the SP on SP_RESET signal interrupt #1946

lzrd · 2024-12-13T00:48:30Z

This is a work in progress. Measurement currently takes ~~12.8s~~ about 0.482 seconds, and we'd like to improve that or accommodate that delay before merging.
That's in addition to any feedback that people may have.

On SP_RESET asserted, the RoT will measure the entire active SP Hubris bank (Hubris plus 0xff padding from 0x800_0000 to 0x80f_ffff).
The measurement reliably takes about 0.482 seconds since neither the SP nor the RoT is doing anything else during this time. This low measurement time, together with the existing timeouts and retries used by the control plane, mitigates concerns that were present when the measurement time was measured in multiple seconds.

This time could be further reduced. It is currently taking 0.245 seconds to inject the measurement program into the SP. Each u32 is being sent and read back separately and retried if there is any error (no errors have been seen as of yet).
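
The per-word write, read-back, and retry scheme described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the `WordAccess` trait, `MAX_ATTEMPTS`, `write_verified`, `inject`, and `FakeTarget` are all hypothetical names, and the retry budget of 3 is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for word-wide SWD memory access (not the driver's API).
trait WordAccess {
    fn write_word(&mut self, addr: u32, value: u32) -> Result<(), ()>;
    fn read_word(&mut self, addr: u32) -> Result<u32, ()>;
}

/// Assumed per-word attempt budget; the PR text does not state one.
const MAX_ATTEMPTS: u32 = 3;

/// Write one u32, read it back, and retry on any error or mismatch.
fn write_verified<T: WordAccess>(t: &mut T, addr: u32, value: u32) -> Result<(), ()> {
    for _ in 0..MAX_ATTEMPTS {
        if t.write_word(addr, value).is_ok() && t.read_word(addr) == Ok(value) {
            return Ok(());
        }
    }
    Err(())
}

/// Inject a little-endian image one word at a time, starting at `base`.
fn inject<T: WordAccess>(t: &mut T, base: u32, image: &[u8]) -> Result<(), ()> {
    // Reject odd-sized images before touching the target.
    if image.len() % 4 != 0 {
        return Err(());
    }
    for (i, chunk) in image.chunks_exact(4).enumerate() {
        let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        write_verified(t, base + 4 * i as u32, word)?;
    }
    Ok(())
}

/// In-memory fake target whose first few writes fail, for exercising the sketch.
struct FakeTarget {
    mem: HashMap<u32, u32>,
    failing_writes: u32,
}

impl FakeTarget {
    fn new(failing_writes: u32) -> Self {
        FakeTarget { mem: HashMap::new(), failing_writes }
    }

    fn peek(&self, addr: u32) -> Option<u32> {
        self.mem.get(&addr).copied()
    }
}

impl WordAccess for FakeTarget {
    fn write_word(&mut self, addr: u32, value: u32) -> Result<(), ()> {
        if self.failing_writes > 0 {
            self.failing_writes -= 1;
            return Err(());
        }
        self.mem.insert(addr, value);
        Ok(())
    }

    fn read_word(&mut self, addr: u32) -> Result<u32, ()> {
        self.mem.get(&addr).copied().ok_or(())
    }
}
```

Because every word costs a separate write and read, a block-oriented transfer with a single verification pass would be one way to cut the injection time, at the price of coarser error recovery.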

If all is well with the SP, the measurement will match the Sha3_256 value represented in the SP Hubris archive in the file named "img/final.fwid".

Testing SP measurement

Check out the hubris branch attest-sp:

APP=$PWD/app/oxide-rot-1/app-dev.toml
ARCHIVE=$(cargo -q xtask print --archive --image-name a $APP)
cargo xtask flash $APP

Measure on SP Reset:

Hit the reset button on the SP or use the SpCtrl.db_reset_sp function:

humility --archive=$ARCHIVE hiffy -c SpCtrl.db_reset_sp -a delay_ms=10
sleep 13

See traces

The ringbuf traces in the swd and attest tasks have the interesting information, including the time expended:

humility --archive=$ARCHIVE ringbuf

Dump the attestation log:

humility --archive=$ARCHIVE hiffy -c Attest.log -a offset=0 -n 256 -o out
hexdump -C out

The hexdump is not hard to read directly. The first four bytes are a LE u32 giving
the number of valid entries. Each entry is a 1-byte discriminator that is 0_u8 for Sha3_256 followed by the 32-byte measurement (FWID). There is space for 16 measurements.
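
For anyone who would rather not read the hexdump by eye, a small parser for the layout just described could look like the sketch below. It assumes the `out` file contains exactly that layout (a LE u32 count, then 33-byte entries); `parse_attest_log` and the constants are hypothetical names, not part of Hubris or humility.

```rust
/// Digest length of a Sha3_256 measurement (FWID), in bytes.
const SHA3_256_LEN: usize = 32;
/// Discriminator byte for a Sha3_256 entry, per the description above.
const DISCRIMINATOR_SHA3_256: u8 = 0;
/// One entry: a 1-byte discriminator followed by the digest.
const ENTRY_LEN: usize = 1 + SHA3_256_LEN;
/// The log has space for 16 measurements.
const MAX_ENTRIES: usize = 16;

/// Hypothetical helper: parse the bytes dumped by `Attest.log` into digests.
fn parse_attest_log(bytes: &[u8]) -> Result<Vec<[u8; SHA3_256_LEN]>, String> {
    if bytes.len() < 4 {
        return Err("log is shorter than the 4-byte entry count".to_string());
    }
    // The first four bytes are a little-endian u32 count of valid entries.
    let mut header = [0u8; 4];
    header.copy_from_slice(&bytes[..4]);
    let count = u32::from_le_bytes(header) as usize;
    if count > MAX_ENTRIES {
        return Err(format!("entry count {} exceeds {}", count, MAX_ENTRIES));
    }

    let mut entries = Vec::with_capacity(count);
    for i in 0..count {
        let start = 4 + i * ENTRY_LEN;
        let entry = bytes
            .get(start..start + ENTRY_LEN)
            .ok_or_else(|| format!("entry {} is truncated", i))?;
        if entry[0] != DISCRIMINATOR_SHA3_256 {
            return Err(format!("entry {}: unknown discriminator {}", i, entry[0]));
        }
        let mut digest = [0u8; SHA3_256_LEN];
        digest.copy_from_slice(&entry[1..]);
        entries.push(digest);
    }
    Ok(entries)
}
```

Reading `out` with `std::fs::read` and passing the bytes to `parse_attest_log` would yield the recorded digests, which can then be compared against the contents of "img/final.fwid".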

Notes

The RoT only attempts a measurement if it detects an SP reset.
Existing measurements are cleared if an STLINK power-on is detected or if an STLINK is on during a measurement attempt. It is observed that powered-off STLINKs have no effect.
No SP reset can be captured until the swd task has initialized. If needed, the control plane can request the SP to reset itself which will result in the RoT measuring the SP.
The attest task ~~needs~~ has a reset_log_and_record function usable only by the swd task (and hiffy for debug builds).

github-actions

license-eye has totally checked 554 files.

Valid	Invalid	Ignored	Fixed
552	2	0	0

lib/endoscope/src/main.rs
lib/endoscope/src/shared.rs

labbott

This looks really nice. Thanks for putting in the work to finally make the RoT a reality!

Either before we merge or right afterwards, I think we should make sure this gets run through a sample manufacturing flow so that we don't have any surprises there.

labbott · 2025-02-24T14:29:52Z

drv/lpc55-swd/src/main.rs

+        // It takes about 0.25 seconds (236 RoT systicks) for `endoscope` to run.
+        // Allow about twice that time for the measurement to complete.
+        // endoscope executes a BKPT instruction on completion.
+        // We observe an S_HALT state if all goes well.
+        if self.halt_wait(512).is_err() {
+            ringbuf_entry!(Trace::DidNotHalt);
+            return Err(());
+        };


What do we observe if endoscope either loops forever or hits a fault?

labbott · 2025-02-24T14:37:08Z

drv/lpc55-swd/src/main.rs

+        self.halt_wait(5000)
+    }
+
+    fn halt_wait(&mut self, timeout: u64) -> Result<(), SpCtrlError> {


I think I confused myself because halt_wait means wait for the system to halt, not attempt to halt and then wait. Can you either add a comment to this effect or change the name?

wait_for_sp_halt would be better.

mkeeter · 2025-02-24T17:01:31Z

drv/lpc55-swd/src/main.rs

+        value: u32,
+    ) -> Result<(), SpCtrlError> {
+        // C1.6 Debug system registers
+        let r = match register {


Vibes: should we make the enum Reg include all of these values? Then, we could make this function take a register: Reg instead of having to check here.

I like this idea. I found it difficult to reason about the switch statement.

mkeeter · 2025-02-24T17:07:01Z

drv/lpc55-swd/src/main.rs

+            .map_err(|_| SpCtrlError::Fault)?;
+
+        let cnt = buf.len();
+        if cnt % 4 != 0 {


Nit: should we do this check before starting a read transaction, to avoid leaving the SP in a weird state?
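
A minimal sketch of that ordering, with hypothetical names throughout (`ReadError`, `WordSource`, `read_into`, and `CountingSource` are illustrations, not the driver's types): the length is validated first, so a bad buffer never starts a transaction.

```rust
#[derive(Debug, PartialEq)]
enum ReadError {
    BadLength,
}

/// Hypothetical stand-in for a word-oriented read transaction on the target.
trait WordSource {
    fn begin(&mut self, addr: u32) -> Result<(), ReadError>;
    fn next_word(&mut self) -> Result<u32, ReadError>;
}

/// Fill `buf` with little-endian words read from `addr`.
fn read_into<S: WordSource>(
    src: &mut S,
    addr: u32,
    buf: &mut [u8],
) -> Result<(), ReadError> {
    // Validate the request before any transaction is started on the target.
    if buf.len() % 4 != 0 {
        return Err(ReadError::BadLength);
    }
    src.begin(addr)?;
    for chunk in buf.chunks_exact_mut(4) {
        let word = src.next_word()?;
        chunk.copy_from_slice(&word.to_le_bytes());
    }
    Ok(())
}

/// Fake source that counts started transactions and returns 1, 2, 3, ...
struct CountingSource {
    begun: u32,
    next: u32,
}

impl WordSource for CountingSource {
    fn begin(&mut self, _addr: u32) -> Result<(), ReadError> {
        self.begun += 1;
        Ok(())
    }

    fn next_word(&mut self) -> Result<u32, ReadError> {
        let word = self.next;
        self.next += 1;
        Ok(word)
    }
}
```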

mkeeter · 2025-02-24T17:12:53Z

drv/lpc55-swd/src/main.rs

+        let gpio = Pins::from(self.gpio);
+        setup_rot_to_sp_reset_l_in(self.gpio);
+        gpio.set_val(ROT_TO_SP_RESET_L_IN, Value::One); // should be a no-op
+        if squelch_notify {


I'm confused by squelch_notify here; wouldn't the IRQ have already fired in sp_reset_enter?

We're re-asserting SP_RESET and don't want to be notified about the reset that we caused. This removes the interrupt that we would have gotten on return from the handler.

Doesn't the kernel queue the notification as soon as the pin changes state? I would expect that we need a call to sys_irq_control_clear_pending here as well.

mkeeter · 2025-02-24T18:25:04Z

drv/lpc55-swd/src/main.rs

+        self.init = true; // we don't want GPIO pin reconfiguration.
+        if self.do_setup_swd().is_ok() {


These lines make me nervous: do_setup_swd itself sets self.init = true on success, but here we unconditionally set it. I feel like init: bool is too ambiguous; we're using it both for "are pins configured" and "are we controlling the SP over SWD". Should it instead be an enum type?
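
One possible shape for such an enum, purely as a sketch: the `SwdState` type, its variants, and its methods are hypothetical and only illustrate separating "pins are configured" from "we are controlling the SP over SWD".

```rust
/// Hypothetical replacement for `init: bool`, with one variant per situation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SwdState {
    /// GPIO pins have not been set up for SWD yet.
    PinsUnconfigured,
    /// Pins are set up, but we are not driving the SP's debug port.
    PinsConfigured,
    /// Pins are set up and the SWD session with the SP is established.
    Attached,
}

impl SwdState {
    /// True when pin reconfiguration can be skipped.
    fn pins_configured(self) -> bool {
        self != SwdState::PinsUnconfigured
    }

    /// True when the SP is under SWD control.
    fn attached(self) -> bool {
        self == SwdState::Attached
    }

    /// State after a setup attempt: success attaches, failure changes nothing.
    fn after_setup(self, setup_ok: bool) -> SwdState {
        if setup_ok {
            SwdState::Attached
        } else {
            self
        }
    }
}
```

With this shape, the reset path could set `PinsConfigured` explicitly before calling setup, and only a successful setup would ever produce `Attached`.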

mkeeter · 2025-02-24T18:34:12Z

drv/lpc55-swd/src/main.rs

+        // Setting up to inject the measurement program into the SP
+        // has several potential failures. Use this `prep` closure
+        // and `need_undo` state to keep from indenting too much.
+        let mut prep = |undo: &mut Undo| -> Result<(), ()> {


Nit: you can capture need_undo automatically and don't need to pass it as an argument (so *undo in the function body would just become need_undo)

mkeeter · 2025-02-24T18:35:34Z

drv/lpc55-swd/src/main.rs

+            if self
+                .dp_write_bitflags::<Demcr>(Demcr::VC_CORERESET)
+                .is_err()
+            {
+                ringbuf_entry!(Trace::DemcrWriteError);
+                *undo |= Undo::VC_CORERESET;
+                return Err(());
+            }
+            *undo |= Undo::VC_CORERESET;


Nit: you can remove duplicate code by capturing the result, then setting undo, then checking it.

Suggested change

-            if self
-                .dp_write_bitflags::<Demcr>(Demcr::VC_CORERESET)
-                .is_err()
-            {
-                ringbuf_entry!(Trace::DemcrWriteError);
-                *undo |= Undo::VC_CORERESET;
-                return Err(());
-            }
-            *undo |= Undo::VC_CORERESET;
+            let r = self.dp_write_bitflags::<Demcr>(Demcr::VC_CORERESET);
+            *undo |= Undo::VC_CORERESET;
+            if r.is_err() {
+                ringbuf_entry!(Trace::DemcrWriteError);
+                return Err(());
+            }

mkeeter · 2025-02-24T18:39:58Z

drv/lpc55-swd/src/main.rs

+            // Asserting SP_RESET for >1ms here works.
+            hl::sleep_for(1);
+
+            if self.dp_write_bitflags::<Dhcsr>(Dhcsr::resume()).is_err() {


resume() is a confusing name for the function which sets debug enable

mkeeter · 2025-02-24T18:44:02Z

drv/lpc55-swd/src/main.rs

+        let digest = if prep(&mut need_undo).is_ok() {
+            self.do_measure_sp()
+        } else {
+            Err(())
+        };


Suggested change

-        let digest = if prep(&mut need_undo).is_ok() {
-            self.do_measure_sp()
-        } else {
-            Err(())
-        };
+        let digest = prep().and_then(|()| self.do_measure_sp());

(this assumes you take my suggestion above of capturing need_undo instead of passing it in)
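
Putting the capture and `and_then` suggestions together, a standalone sketch might look like this. The `Driver` type, `write_demcr`, the `fail_prep` flag, and the `UNDO_VC_CORERESET` bit are hypothetical stand-ins for the real driver state; only the closure shape is the point.

```rust
/// Hypothetical undo bit standing in for `Undo::VC_CORERESET`.
const UNDO_VC_CORERESET: u8 = 1;

/// Hypothetical driver; `fail_prep` simulates a failing DEMCR write.
struct Driver {
    fail_prep: bool,
}

impl Driver {
    fn write_demcr(&mut self) -> Result<(), ()> {
        if self.fail_prep {
            Err(())
        } else {
            Ok(())
        }
    }

    fn do_measure_sp(&mut self) -> Result<[u8; 32], ()> {
        Ok([0x5a; 32])
    }

    /// Returns the digest (if any) and the undo bits that were recorded.
    fn measure(&mut self) -> (Result<[u8; 32], ()>, u8) {
        let mut need_undo = 0u8;

        // `need_undo` and `self` are captured by the closure, so nothing
        // has to be passed in as an argument.
        let mut prep = || -> Result<(), ()> {
            // Capture the result, record the undo step, then check.
            let r = self.write_demcr();
            need_undo |= UNDO_VC_CORERESET;
            r
        };

        // The closure's borrows end after this call, so `self` and
        // `need_undo` are usable again below.
        let digest = prep().and_then(|()| self.do_measure_sp());
        (digest, need_undo)
    }
}
```

The undo bit is recorded whether or not the write succeeds, matching the behavior of the original duplicated code.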

mkeeter · 2025-02-24T18:44:47Z

drv/lpc55-swd/src/main.rs

+
+        // This read of DHCSR should never be useful given the code above.
+        // Read back to make sure that VC_CORERESET is clear.
+        // XXX remove


Is it time to remove this?

mkeeter · 2025-02-24T18:46:27Z

drv/lpc55-swd/src/main.rs

+        }
+
+        // The SP is still halted.
+        // Get it running again by toggling its RESET pin.


It would also be nice to leave a comment here explaining that the caller is responsible for clearing the pending IRQ, so we don't measure it again.

Add interrupt-related API calls to the LPC55 `gpio_driver`.

A task on an LPC55 can now configure and use GPIO interrupts.

app.toml example shows Pin Interrupt configuration:

    [tasks.foo]
    ...
    interrupts = { "pint.irq0" = "button-irq" }
    ...
    task-slots = ["gpio_driver", ...]

    [tasks.foo.config]
    pins = [
        { name="BUTTON", pin={ port=1, pin=9}, alt=0, pint=0, direction="input", opendrain="normal" }
    ]

swd:
- detects JTAG/dongle and SP_RESET
- can drive SP_RESET to stop SP and initiate measurement procedure.
- on JTAG dongle present or other failure, don't record a bogus FWID, just reset the log.
- on SP reset detection, injects a blob to measure the SP.
- on powered JTAG dongle detection, resets attestation log.
- swd logs the result to attest task.
- the hiffy command `db_reset_sp` is a conditional feature for debugging and testing.

endoscope: An injectable code blob covered by the RoT signature.
- configure STM32H7 clocks in endoscope for best performance (thanks Cliff).
- Use instruction/data tightly coupled memory (ITCM/DTCM) for better performance.
- Build the injectable SP measurement blob as a cargo `bindeps` artifact.

attest:
- Add `reset`, and `reset_and_record` to the attest task.
- Only tasks listed in `[tasks.attest.config]` are allowed to reset the log.
- Unauthorized tasks get Err(ClientError::AccessViolation).

build/lpc55pins:
- Modify build/lpc55pins to generate separate GPIO pin setup functions for named pins. This allows easier switching between pin configurations at runtime.

Also add safety command in endoscope.

…ogic.

The swd task doesn't need to perform an SP Hubris image sanity check. It measures the whole active image bank no matter what the contents. Remove some unwrap_lite() calls.

- Remove use of unwrap_lite.
- Fix write/read-back/retry logic for injecting endoscope.
- Add comments documenting STM32H753 vector table.

mkeeter · 2025-02-26T20:49:05Z

drv/lpc55-swd/src/main.rs

-        if self
-            .do_write_core_register(Reg::Dr.into(), sp_reset_vector | 1)
-            .is_err()
+        if let Some(sp_reset_vector) = slice_to_le_u32(&ENDOSCOPE_BYTES[4..=7])


Why did this change? I found the previous version to be more clear:

let sp_reset_vector = u32::from_le_bytes(ENDOSCOPE_BYTES[4..=7].try_into().unwrap_lite());

versus the new code, which hides the unreachable! in the else of a conditional which is always taken

lzrd mentioned this pull request Dec 13, 2024

attestation API suitable for use from (faux-)?mgs #1605

Open

lzrd force-pushed the attest-sp branch from 12d5949 to 5b84222 Compare February 3, 2025 20:50

github-actions bot reviewed Feb 3, 2025

lzrd force-pushed the attest-sp branch 3 times, most recently from bb29475 to 64c50cc Compare February 9, 2025 22:59

lzrd force-pushed the attest-sp branch 2 times, most recently from 1dcbf79 to 2f49e0c Compare February 20, 2025 19:26

lzrd marked this pull request as ready for review February 21, 2025 18:16

lzrd requested review from cbiffle, flihp and labbott as code owners February 21, 2025 18:16

lzrd changed the title ~~WIP - Measure the SP on SP_RESET signal interrupt~~ Measure the SP on SP_RESET signal interrupt Feb 21, 2025