gimlet-inspector: first draft. #1597

Merged
merged 1 commit into from
Jan 26, 2024

cbiffle
Collaborator

@cbiffle cbiffle commented Jan 25, 2024

No description provided.

@cbiffle cbiffle requested a review from mkeeter January 25, 2024 01:07
Cargo.toml Outdated
@@ -116,6 +116,7 @@ zip = { version = "0.6", default-features = false, features = ["bzip2"] }
attest-data = { git = "https://github.com/oxidecomputer/dice-util", default-features = false, version = "0.1.0" }
dice-mfg-msgs = { git = "https://github.com/oxidecomputer/dice-util", default-features = false, version = "0.2.1" }
gateway-messages = { git = "https://github.com/oxidecomputer/management-gateway-service", default-features = false, features = ["smoltcp"] }
gimlet-inspector-protocol = { git = "https://github.com/oxidecomputer/gimlet-inspector-protocol", version = "0.1.0", branch = "initial-protocol" }
Collaborator Author

This line will need to change before merge, once the protocol PR is merged over there.

Member

@hawkw hawkw left a comment

overall, this is very straightforward, but i left questions on a couple things i was curious about.

task/gimlet-inspector/src/main.rs Outdated
Err(SendError::ServerRestarted) => continue,
// If our tx queue is full, wait for space. This is the
// same notification we get for incoming packets, so we
// might spuriously wake up due to an incoming packet
Member

for my own education: what happens to the incoming packet in this case?

Collaborator Author

It hangs out in our RX queue until popped.

I actually just had to go read the netstack code to answer the anticipated follow-on question: so what determines when we get the next RX notification? The answer is, it depends on event flow in the netstack, and we are probably currently spamming any tasks that get into this state with RX notifications. I will file a bug.

Member

oh, interesting. seems like a good catch!
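
For readers following this thread, the retry behavior being discussed can be sketched as a small self-contained model. This is not the task's actual code or the netstack's API: `FakeSocket`, `Wake`, `wait_for_notification`, and the `QueueFull` variant name are all hypothetical stand-ins, and only `ServerRestarted` is taken from the diff above. The point it illustrates is that a wakeup caused by an incoming packet is harmless: the packet stays in the RX queue, and the loop simply retries the send.

```rust
use std::collections::VecDeque;

/// Hypothetical subset of send errors. `ServerRestarted` appears in the
/// diff; `QueueFull` is an assumed name for "our tx queue is full".
#[derive(Debug, PartialEq)]
enum SendError {
    ServerRestarted,
    QueueFull,
}

/// Scripted reasons the single shared notification might fire.
enum Wake {
    RxArrived(Vec<u8>),
    TxDrained,
}

/// Toy socket with a bounded TX queue and an RX queue (assumption, not
/// the real netstack).
struct FakeSocket {
    tx: VecDeque<Vec<u8>>,
    tx_capacity: usize,
    rx: VecDeque<Vec<u8>>,
    restarts_pending: u32,
    wakes: VecDeque<Wake>,
}

impl FakeSocket {
    fn send(&mut self, pkt: &[u8]) -> Result<(), SendError> {
        if self.restarts_pending > 0 {
            self.restarts_pending -= 1;
            return Err(SendError::ServerRestarted);
        }
        if self.tx.len() >= self.tx_capacity {
            return Err(SendError::QueueFull);
        }
        self.tx.push_back(pkt.to_vec());
        Ok(())
    }

    /// Models blocking on the notification shared by RX and TX events.
    fn wait_for_notification(&mut self) {
        match self.wakes.pop_front().expect("would block forever") {
            // Spurious from the sender's point of view: the packet just
            // waits in the RX queue until someone pops it.
            Wake::RxArrived(pkt) => self.rx.push_back(pkt),
            // The wakeup the sender actually wanted: TX space freed up.
            Wake::TxDrained => {
                self.tx.pop_front();
            }
        }
    }
}

/// Retries until the packet is queued; returns the number of attempts.
fn send_with_retry(sock: &mut FakeSocket, pkt: &[u8]) -> u32 {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match sock.send(pkt) {
            Ok(()) => return attempts,
            Err(SendError::ServerRestarted) => continue,
            Err(SendError::QueueFull) => sock.wait_for_notification(),
        }
    }
}
```

In this model, a wake caused by `RxArrived` costs one extra loop iteration and nothing else; the follow-on question in the thread (how often such notifications are re-posted) is outside what this sketch covers.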

}
// These errors should be impossible if we're configured
// correctly.
Err(SendError::NotYours | SendError::InvalidVLan) => {
Member

out of curiosity: in these cases where we panic due to one of multiple unhandled errors, are we intentionally not indicating which it is in the panic to avoid putting more strings in the binary, or something? naively, it seems like it would be nice to know which of the two “things that shouldn’t happen” happened, on the off chance one of them does happen.

Collaborator Author

> are we intentionally not indicating which it is in the panic to avoid putting more strings in the binary, or something?

Exactly. We get the line number, which I'm hoping will suffice in the event that this happens. (What we really ought to do is divide up the error type to just the variants that can happen for a given operation.)

Member

yeah, that makes sense --- refining the error types so that each operation returns an enum of just the variants that it can actually return seems like a good idea! i could, potentially, pick that up, given some pointers?

@cbiffle cbiffle enabled auto-merge (rebase) January 26, 2024 19:04
@cbiffle cbiffle merged commit 02cc434 into master Jan 26, 2024
77 checks passed
@cbiffle cbiffle deleted the gimlet-inspector branch January 26, 2024 19:30
hawkw added a commit that referenced this pull request Jan 27, 2024
Inspired by [this comment][1] from @cbiffle.

Currently, the Gimlet Sequencer API has one error enum that's returned
by all API methods. This enum contains a bunch of error variants that
are specific to actually setting the sequencer state, and a
`ReadRegsFailed` variant that's returned by the `read_fpga_regs` API
method. This means that, currently, calls to set the sequencer state
have to handle the `ReadRegsFailed` variant, even though it's never
actually returned by the code that sets the sequencer state, and,
conversely, code that reads the FPGA registers has to handle a bunch of
error variants that aren't related to FPGA register reads.

This commit separates the error types of the `read_fpga_regs` and
`set_state` APIs into two separate enums that only contain the errors
returned by that API. I didn't touch the other API methods, although (as
far as I can tell), `get_state`, `fans_on`, and `fans_off` never
actually return any errors and could be changed to
`core::convert::Infallible` a la `send_hardware_nmi`. I didn't want to
mess with these, though, based on the assumption that the `SeqError`
error type was chosen to avoid having to change the API if these methods
changed to return errors. Let me know if that's not the case --- I'm
happy to change them as well.

[1]: #1597 (comment)
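
As a rough illustration of the split described in this commit message, here is a minimal sketch. It is not the real sequencer API: `Sequencer`, `PowerState`, `SetStateError`, `ReadRegsError`, and the `IllegalTransition` and `PowerFault` variants are hypothetical placeholders. Only the method names, the `ReadRegsFailed` variant, and the use of `core::convert::Infallible` come from the text above.

```rust
use core::convert::Infallible;

/// Errors that only `set_state` can return (variant names are made up).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SetStateError {
    IllegalTransition,
    PowerFault,
}

/// Errors that only `read_fpga_regs` can return.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReadRegsError {
    ReadRegsFailed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PowerState {
    Off,
    On,
}

/// Toy sequencer used only to exercise the error types.
struct Sequencer {
    state: PowerState,
    fpga_ok: bool,
    fault: bool,
}

impl Sequencer {
    /// Callers match on `SetStateError` only; no register-read variant
    /// needs handling here.
    fn set_state(&mut self, next: PowerState) -> Result<(), SetStateError> {
        if self.fault {
            return Err(SetStateError::PowerFault);
        }
        if self.state == next {
            return Err(SetStateError::IllegalTransition);
        }
        self.state = next;
        Ok(())
    }

    /// Callers match on `ReadRegsError` only.
    fn read_fpga_regs(&self) -> Result<[u8; 4], ReadRegsError> {
        if self.fpga_ok {
            Ok([0; 4])
        } else {
            Err(ReadRegsError::ReadRegsFailed)
        }
    }

    /// A method that cannot fail can say so in its signature.
    fn get_state(&self) -> Result<PowerState, Infallible> {
        Ok(self.state)
    }
}

/// With `Infallible`, the error arm is statically unreachable, so no
/// panic path (and no panic string) is needed.
fn current_state(seq: &Sequencer) -> PowerState {
    match seq.get_state() {
        Ok(state) => state,
        Err(never) => match never {},
    }
}
```

The trade-off noted in the commit message still applies: switching a method to `Infallible` means its signature has to change again if it later gains a real failure mode.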
mkeeter added a commit that referenced this pull request Jan 29, 2024
#1597 bumped `syn` and `proc-macro2`.  For some reason, this slightly changes
the size of the `sprot` task, so `master` is not building for me right now:

```console
Error: task sprot: needs 47360 bytes of flash but max-sizes limits it to 47328
```

[Last time I investigated](https://matrix.to/#/!jxnlDLnJsVaIRstnrt:oxide.computer/$pCYj2H5LWUDkQwx_svMii1GZiOkXxpHA26BOjJzXPBA?via=oxide.computer&via=unix.house&via=matrix.org),
I came to the conclusion that bumping `proc-macro2` caused LLVM functions to be
emitted in a different order, which changed inlining behavior, which pushed one
task over the size limit (and there were literally no changes in the generated
LLVM IR).
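
To make the numbers in that error concrete, here is a toy model of a per-task flash budget check. It is not the build system's actual code: `check_flash` and `overage` are hypothetical helpers, and only the message wording and the two byte counts come from the console output above.

```rust
/// Returns an error message in the same shape as the console output
/// above when a task's flash usage exceeds its configured limit.
fn check_flash(task: &str, needed: u32, limit: u32) -> Result<(), String> {
    if needed > limit {
        Err(format!(
            "task {task}: needs {needed} bytes of flash but max-sizes limits it to {limit}"
        ))
    } else {
        Ok(())
    }
}

/// How far over budget a task is, in bytes (zero if it fits).
fn overage(needed: u32, limit: u32) -> u32 {
    needed.saturating_sub(limit)
}
```

For the values reported here, `overage(47360, 47328)` is 32 bytes, which is consistent with the explanation that a small change in inlining was enough to push the task over its limit.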