gimlet-inspector: first draft. #1597

Merged
merged 1 commit into from
Jan 26, 2024
56 changes: 39 additions & 17 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -116,6 +116,7 @@ zip = { version = "0.6", default-features = false, features = ["bzip2"] }
attest-data = { git = "https://github.com/oxidecomputer/dice-util", default-features = false, version = "0.1.0" }
dice-mfg-msgs = { git = "https://github.com/oxidecomputer/dice-util", default-features = false, version = "0.2.1" }
gateway-messages = { git = "https://github.com/oxidecomputer/management-gateway-service", default-features = false, features = ["smoltcp"] }
gimlet-inspector-protocol = { git = "https://github.com/oxidecomputer/gimlet-inspector-protocol", version = "0.1.0" }
hif = { git = "https://github.com/oxidecomputer/hif", default-features = false }
humpty = { git = "https://github.com/oxidecomputer/humpty", default-features = false, version = "0.1.3" }
hubtools = { git = "https://github.com/oxidecomputer/hubtools", default-features = false, version = "0.4.1" }
17 changes: 17 additions & 0 deletions app/gimlet/base.toml
@@ -154,6 +154,16 @@ copy-to-archive = ["register_defs"]
fpga_image = "fpga-b.bin"
register_defs = "gimlet-regs-b.json"

[tasks.gimlet_inspector]
name = "task-gimlet-inspector"
priority = 6
features = ["vlan"]
max-sizes = {flash = 16384, ram = 2048 }
stacksize = 1600
start = true
task-slots = ["net", {seq = "gimlet_seq"}]
notifications = ["socket"]

[tasks.hash_driver]
name = "drv-stm32h7-hash-server"
features = ["h753"]
@@ -1271,6 +1281,13 @@ port = 11113
tx = { packets = 3, bytes = 1024 }
rx = { packets = 3, bytes = 1024 }

[config.net.sockets.inspector]
kind = "udp"
owner = {name = "gimlet_inspector", notification = "socket"}
port = 23547
tx = { packets = 3, bytes = 1024 }
rx = { packets = 3, bytes = 512 }

[config.sprot]
# ROT_IRQ (af=0 for GPIO, af=15 when EXTI is implemented)
rot_irq = { port = "E", pin = 3, af = 0}
28 changes: 28 additions & 0 deletions task/gimlet-inspector/Cargo.toml
@@ -0,0 +1,28 @@
[package]
name = "task-gimlet-inspector"
version = "0.1.0"
edition = "2021"

[dependencies]
serde = { workspace = true }
hubpack = { workspace = true }
gimlet-inspector-protocol = { workspace = true }

task-net-api = { path = "../net-api" }
drv-gimlet-seq-api = { path = "../../drv/gimlet-seq-api" }
userlib = { path = "../../sys/userlib", features = ["panic-messages"] }


[build-dependencies]
build-util = { path = "../../build/util" }

[features]
vlan = ["task-net-api/vlan"]

# This section is here to discourage RLS/rust-analyzer from doing test builds,
# since test builds don't work for cross compilation.
[[bin]]
name = "task-gimlet-inspector"
test = false
doctest = false
bench = false
8 changes: 8 additions & 0 deletions task/gimlet-inspector/build.rs
@@ -0,0 +1,8 @@
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    build_util::build_notifications()?;
    Ok(())
}
152 changes: 152 additions & 0 deletions task/gimlet-inspector/src/main.rs
@@ -0,0 +1,152 @@
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

//! The Gimlet Inspector provides a deliberately limited set of IPCs over the
//! network, for extracting diagnostic data from a live system. This is intended
//! to supplement the more general `dump_agent`.

#![no_std]
#![no_main]

use core::sync::atomic::{AtomicUsize, Ordering};
use drv_gimlet_seq_api::{SeqError, Sequencer};
use gimlet_inspector_protocol::{
    QueryV0, Request, SequencerRegistersResponseV0, ANY_RESPONSE_V0_MAX_SIZE,
    REQUEST_TRAILER,
};
use hubpack::SerializedSize;
use task_net_api::*;
use userlib::*;

task_slot!(NET, net);
task_slot!(SEQ, seq);

#[no_mangle]
static CTR_RECVD: AtomicUsize = AtomicUsize::new(0);
#[no_mangle]
static CTR_REJECTED: AtomicUsize = AtomicUsize::new(0);
#[no_mangle]
static CTR_RESPONSES: AtomicUsize = AtomicUsize::new(0);

#[export_name = "main"]
fn main() -> ! {
    // Look up our peer task IDs and make clients.
    let net = Net::from(NET.get_task_id());
    let seq = Sequencer::from(SEQ.get_task_id());

    const SOCKET: SocketName = SocketName::inspector;

    loop {
        // These buffers are currently kept kinda small because our protocol
        // messages are small.
        let mut rx_data_buf = [0u8; Request::MAX_SIZE + REQUEST_TRAILER];
        let mut tx_data_buf = [0u8; ANY_RESPONSE_V0_MAX_SIZE];

        match net.recv_packet(
            SOCKET,
            LargePayloadBehavior::Discard,
            &mut rx_data_buf,
        ) {
            Ok(mut meta) => {
                CTR_RECVD.fetch_add(1, Ordering::Relaxed);

                let Ok((request, _trailer)) = hubpack::deserialize::<Request>(&rx_data_buf) else {
                    // We ignore malformatted, truncated, etc. packets.
                    CTR_REJECTED.fetch_add(1, Ordering::Relaxed);
                    continue;
                };

                match request {
                    Request::V0(QueryV0::SequencerRegisters) => {
                        let result = seq.read_fpga_regs();
                        let (resp, trailer) = match result {
                            Ok(regs) => (
                                SequencerRegistersResponseV0::Success,
                                Some(regs),
                            ),
                            Err(SeqError::ServerRestarted) => (
                                SequencerRegistersResponseV0::SequencerTaskDead,
                                None,
                            ),
                            Err(_) => {
                                // The SeqError type represents a mashing
                                // together of all possible errors for all
                                // possible sequencer IPC operations. The only
                                // one we _expect_ here is ReadRegsFailed.
                                (SequencerRegistersResponseV0::SequencerReadRegsFailed, None)
                            }
                        };
                        let mut len =
                            hubpack::serialize(&mut tx_data_buf, &resp)
                                .unwrap_lite();
                        if let Some(t) = trailer {
                            tx_data_buf[len..len + t.len()].copy_from_slice(&t);
                            len += t.len();
                        }
                        meta.size = len as u32;
                    }
                }

                // With the response packet prepared, we may need to attempt
                // sending more than once.
                loop {
                    match net.send_packet(
                        SOCKET,
                        meta,
                        &tx_data_buf[0..(meta.size as usize)],
                    ) {
                        Ok(()) => {
                            CTR_RESPONSES.fetch_add(1, Ordering::Relaxed);
                            break;
                        }
                        // If `net` just restarted, immediately retry our send.
                        Err(SendError::ServerRestarted) => continue,
                        // If our tx queue is full, wait for space. This is the
                        // same notification we get for incoming packets, so we
                        // might spuriously wake up due to an incoming packet
Member

for my own education: what happens to the incoming packet in this case?

Collaborator Author

It hangs out in our RX queue until popped.

I actually just had to go read the netstack code to answer the anticipated follow-on question: so what determines when we get the next RX notification? The answer is, it depends on event flow in the netstack, and we are probably currently spamming any tasks that get into this state with RX notifications. I will file a bug.

Member

oh, interesting. seems like a good catch!
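
The behavior discussed in this thread can be modeled on a host machine. The following is a minimal sketch, not the real netstack: `MockNet`, `Event`, and `send_with_retry` are hypothetical names, and the single shared "notification" is simulated by a list of events. It only illustrates that a wakeup caused by an incoming packet leaves that packet in the RX queue while the sender keeps retrying.

```rust
use std::collections::VecDeque;

/// Stand-in for the two send failures that the retry loop handles.
#[derive(Debug, PartialEq, Eq)]
enum SendError {
    ServerRestarted,
    QueueFull,
}

/// Hypothetical events that can raise the shared socket notification.
enum Event {
    PacketArrived(Vec<u8>),
    TxDrained,
}

/// Hypothetical model of the `net` task: a bounded TX queue plus an RX queue.
struct MockNet {
    tx_capacity: usize,
    tx: VecDeque<Vec<u8>>,
    rx: VecDeque<Vec<u8>>,
}

impl MockNet {
    fn send_packet(&mut self, data: &[u8]) -> Result<(), SendError> {
        if self.tx.len() >= self.tx_capacity {
            return Err(SendError::QueueFull);
        }
        self.tx.push_back(data.to_vec());
        Ok(())
    }

    /// Models one wakeup: either a packet lands in RX or TX space opens up.
    fn apply(&mut self, event: Event) {
        match event {
            Event::PacketArrived(p) => self.rx.push_back(p),
            Event::TxDrained => {
                self.tx.pop_front();
            }
        }
    }
}

/// Retries the send until it succeeds and returns how many wakeups it took.
fn send_with_retry(
    net: &mut MockNet,
    data: &[u8],
    events: &mut VecDeque<Event>,
) -> usize {
    let mut wakeups = 0;
    loop {
        match net.send_packet(data) {
            Ok(()) => return wakeups,
            // A restarted server is retried immediately, without waiting.
            Err(SendError::ServerRestarted) => continue,
            Err(SendError::QueueFull) => {
                // In this sketch, running out of events is a test setup bug.
                let ev = events.pop_front().expect("no more events");
                net.apply(ev);
                wakeups += 1;
            }
        }
    }
}
```

In this model a spurious wakeup costs one extra loop iteration, and the packet that caused it is still waiting in `rx` once the send finally succeeds.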

                        // (which we can't service anyway because we are still
                        // waiting to respond to a previous request); once we
                        // finally succeed in sending we'll peel any queued
                        // packets off our recv queue at the top of our main
                        // loop.
                        Err(SendError::QueueFull) => {
                            sys_recv_closed(
                                &mut [],
                                notifications::SOCKET_MASK,
                                TaskId::KERNEL,
                            )
                            .unwrap_lite();
                        }
                        // These errors should be impossible if we're configured
                        // correctly.
                        Err(SendError::NotYours | SendError::InvalidVLan) => {
Member

out of curiosity: in these cases where we panic due to one of multiple unhandled errors, are we intentionally not indicating which it is in the panic to avoid putting more strings in the binary, or something? naively, it seems like it would be nice to know which of the two “things that shouldn’t happen” happened, on the off chance one of them does happen.

Collaborator Author

> are we intentionally not indicating which it is in the panic to avoid putting more strings in the binary, or something?

Exactly. We get the line number, which I'm hoping will suffice in the event that this happens. (What we really ought to do is divide up the error type to just the variants that can happen for a given operation.)

Member

yeah, that makes sense --- refining the error types so that each operation returns an enum of just the variants that it can actually return seems like a good idea! i could, potentially, pick that up, given some pointers?
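
One possible reading of the refinement proposed in this thread, as a rough sketch only: the `SendError` below is a local stand-in that reuses the variant names matched in the diff, while `SendRetryable`, `SendBug`, and `narrow` are hypothetical names that exist in neither `task_net_api` nor this PR.

```rust
/// Stand-in for the broad error type. The variant names are the ones matched
/// in the diff; the real definition lives in `task_net_api`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SendError {
    ServerRestarted,
    QueueFull,
    NotYours,
    InvalidVLan,
    Other,
}

/// Hypothetical narrowed type: the only failures a correctly configured
/// sender has to handle at run time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SendRetryable {
    ServerRestarted,
    QueueFull,
}

/// Hypothetical type for the "should be impossible" outcomes, kept as
/// distinct variants so they can be told apart without panic strings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SendBug {
    NotYours,
    InvalidVLan,
    Other,
}

/// Splits the broad error into "retry" and "configuration bug" halves.
pub fn narrow(e: SendError) -> Result<SendRetryable, SendBug> {
    match e {
        SendError::ServerRestarted => Ok(SendRetryable::ServerRestarted),
        SendError::QueueFull => Ok(SendRetryable::QueueFull),
        SendError::NotYours => Err(SendBug::NotYours),
        SendError::InvalidVLan => Err(SendBug::InvalidVLan),
        SendError::Other => Err(SendBug::Other),
    }
}
```

With a split like this, the retry loop would match only on `SendRetryable`, and a single panic site could record the `SendBug` variant as a small integer instead of a string.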

                            unreachable!()
                        }
                        // Unclear under what conditions we could see `Other` -
                        // just panic for now? At the time of this writing
                        // `Other` should only come back if the destination
                        // address in `meta` is bogus or our socket is closed,
                        // neither of which should be possible here.
                        Err(SendError::Other) => panic!(),
                    }
                }
            }
            Err(RecvError::QueueEmpty) => {
                // Our incoming queue is empty. Wait for more packets.
                sys_recv_closed(
                    &mut [],
                    notifications::SOCKET_MASK,
                    TaskId::KERNEL,
                )
                .unwrap_lite();
            }
            Err(RecvError::ServerRestarted) => {
                // `net` restarted (probably due to the watchdog); just retry.
            }
            Err(RecvError::NotYours | RecvError::Other) => panic!(),
        }
    }
}

include!(concat!(env!("OUT_DIR"), "/notifications.rs"));
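
The response path in `main.rs` writes a serialized header into the transmit buffer and then appends the optional register trailer, tracking the total length. A host-side sketch of that layout step follows. The helper name `assemble_response` is hypothetical, the header is taken as already-serialized bytes instead of calling `hubpack`, and this version returns `None` on overflow where the task code would panic on an out-of-range slice.

```rust
/// Copies `header` and then an optional `trailer` into `buf`.
/// Returns the number of bytes written, or `None` if they do not fit.
fn assemble_response(
    buf: &mut [u8],
    header: &[u8],
    trailer: Option<&[u8]>,
) -> Option<usize> {
    let mut len = header.len();
    // `get_mut` yields `None` when the range is out of bounds, so a short
    // buffer is reported to the caller without panicking.
    buf.get_mut(..len)?.copy_from_slice(header);
    if let Some(t) = trailer {
        buf.get_mut(len..len + t.len())?.copy_from_slice(t);
        len += t.len();
    }
    Some(len)
}
```

The returned length plays the role of `meta.size` in the task: only `buf[..len]` would be handed to the send call.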