feat: In-place crypto #2385
Conversation
Only in-place encryption so far, and only for the main data path. Fixes mozilla#2246 (eventually)
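As an aside, this is roughly what "in-place" means here, sketched with the ring crate purely for illustration (neqo drives NSS via neqo-crypto, so none of these calls are the API this PR touches): the buffer that held the ciphertext is reused for the plaintext instead of decrypting into a freshly allocated copy.

use ring::aead::{Aad, LessSafeKey, Nonce, UnboundKey, AES_128_GCM};

fn main() -> Result<(), ring::error::Unspecified> {
    // Throwaway key and nonce, purely for demonstration.
    let key = LessSafeKey::new(UnboundKey::new(&AES_128_GCM, &[0u8; 16])?);
    let nonce = || Nonce::assume_unique_for_key([7u8; 12]);

    // Encrypt in place: the tag is appended to the same Vec.
    let mut buf = b"example payload".to_vec();
    key.seal_in_place_append_tag(nonce(), Aad::empty(), &mut buf)?;

    // Decrypt in place: the plaintext is a sub-slice of `buf`, no extra allocation.
    let plaintext = key.open_in_place(nonce(), Aad::empty(), &mut buf)?;
    assert_eq!(plaintext, b"example payload".as_slice());
    Ok(())
}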
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2385 +/- ##
==========================================
+ Coverage 95.26% 95.29% +0.02%
==========================================
Files 114 114
Lines 36903 37113 +210
Branches 36903 37113 +210
==========================================
+ Hits 35155 35365 +210
Misses 1742 1742
Partials 6 6
☔ View full report in Codecov by Sentry.
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to 108fb8d.
neqo-latest as client
neqo-latest as server

Succeeded Interop Tests
QUIC Interop Runner, client vs. server.
neqo-latest as client
neqo-latest as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server.
neqo-latest as client
neqo-latest as server
Benchmark results
Performance differences relative to e365730.

decode 4096 bytes, mask ff: No change in performance detected.
  time: [11.793 µs 11.849 µs 11.930 µs]
  change: [-4.1333% -1.2448% +0.5654%] (p = 0.49 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected.
  time: [2.9058 ms 2.9165 ms 2.9286 ms]
  change: [-0.1476% +0.3278% +0.8637%] (p = 0.20 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected.
  time: [19.717 µs 19.848 µs 20.053 µs]
  change: [-0.5297% +0.0020% +0.5684%] (p = 0.99 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected.
  time: [4.7132 ms 4.7244 ms 4.7371 ms]
  change: [-0.2998% +0.0468% +0.4013%] (p = 0.78 > 0.05)
decode 4096 bytes, mask 3f: Change within noise threshold.
  time: [6.2397 µs 6.2782 µs 6.3218 µs]
  change: [+0.1124% +0.8835% +1.6191%] (p = 0.02 < 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected.
  time: [2.1124 ms 2.1192 ms 2.1262 ms]
  change: [-0.6450% -0.1280% +0.3322%] (p = 0.63 > 0.05)
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
  time: [93.302 ns 93.606 ns 93.914 ns]
  change: [-0.9063% +0.0093% +0.7964%] (p = 0.98 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
  time: [110.93 ns 111.31 ns 111.73 ns]
  change: [-1.3922% -0.1714% +0.7161%] (p = 0.80 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
  time: [110.38 ns 110.72 ns 111.14 ns]
  change: [-0.3298% +0.1731% +0.7170%] (p = 0.55 > 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
  time: [92.566 ns 92.744 ns 92.990 ns]
  change: [-0.7984% -0.0234% +0.8683%] (p = 0.96 > 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold.
  time: [111.78 ms 111.84 ms 111.89 ms]
  change: [-0.1886% -0.1201% -0.0530%] (p = 0.00 < 0.05)
SentPackets::take_ranges: No change in performance detected.
  time: [5.2445 µs 5.4303 µs 5.6259 µs]
  change: [-2.4306% +0.4653% +3.4569%] (p = 0.76 > 0.05)
transfer/pacing-false/varying-seeds: 💚 Performance has improved.
  time: [37.047 ms 37.122 ms 37.195 ms]
  change: [-8.2698% -7.9849% -7.7178%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: 💚 Performance has improved.
  time: [37.290 ms 37.358 ms 37.426 ms]
  change: [-7.5052% -7.2561% -7.0183%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: 💚 Performance has improved.
  time: [36.858 ms 36.926 ms 36.992 ms]
  change: [-8.5997% -8.3275% -8.0696%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed: 💚 Performance has improved.
  time: [37.691 ms 37.757 ms 37.824 ms]
  change: [-8.1079% -7.8677% -7.6187%] (p = 0.00 < 0.05)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved.
  time: [835.52 ms 844.39 ms 853.44 ms]
  thrpt: [117.17 MiB/s 118.43 MiB/s 119.69 MiB/s]
  change: time: [-4.4010% -2.8467% -1.1933%] (p = 0.00 < 0.05); thrpt: [+1.2077% +2.9301% +4.6036%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold.
  time: [315.26 ms 318.59 ms 321.90 ms]
  thrpt: [31.065 Kelem/s 31.389 Kelem/s 31.720 Kelem/s]
  change: time: [-2.9558% -1.5039% -0.0329%] (p = 0.04 < 0.05); thrpt: [+0.0329% +1.5268% +3.0458%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
  time: [25.601 ms 25.759 ms 25.922 ms]
  thrpt: [38.578 elem/s 38.821 elem/s 39.061 elem/s]
  change: time: [-0.3503% +0.5074% +1.3684%] (p = 0.25 > 0.05); thrpt: [-1.3499% -0.5049% +0.3515%]
1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: No change in performance detected.
  time: [1.8285 s 1.8459 s 1.8633 s]
  thrpt: [53.669 MiB/s 54.175 MiB/s 54.691 MiB/s]
  change: time: [-2.5204% -1.1058% +0.3205%] (p = 0.14 > 0.05); thrpt: [-0.3195% +1.1182% +2.5855%]

Client/server transfer results
Transfer of 33554432 bytes over loopback.
@mxinden when you have a moment, would you take a look at the borrow-checker issue in
Took a quick look.

let dcid = Self::opt(dcid_decoder.decode_cid(&mut decoder))?;
if decoder.remaining() < SAMPLE_OFFSET + SAMPLE_SIZE {
    return Err(Error::InvalidPacket);
}
let header_len = decoder.offset();
return Ok((
    Self {
        packet_type: PacketType::Short,
        dcid,
        scid: None,
        token: &[],
        header_len,
        version: None,
        data,
    },
    &[],
    &mut [],
));

I can take a deeper look and try to fix it.
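For context, a minimal, self-contained sketch of the shape the borrow checker needs to accept here (hypothetical stand-in types, not the real PublicPacket): a read-only header view plus a mutable slice over the rest of the same buffer, made provably disjoint with split_at_mut.

// Hypothetical stand-in for PublicPacket: only a header view is kept.
struct HeaderView<'a> {
    header: &'a [u8],
}

// Split one mutable buffer into a read-only header view and the mutable rest.
// `split_at_mut` is what convinces rustc that the two borrows are disjoint.
fn split_header(buf: &mut [u8], header_len: usize) -> (HeaderView<'_>, &mut [u8]) {
    let (header, payload) = buf.split_at_mut(header_len);
    (HeaderView { header }, payload)
}

fn main() {
    let mut packet = vec![0u8; 16];
    let (view, payload) = split_header(&mut packet, 4);
    payload[0] = 0xff; // mutate the payload while the header view is still live
    assert_eq!(view.header.len(), 4);
}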
Thanks for the analysis! Wonder if we can make
Ah, never seen this before. That would be error prone as the bytes within the range in
The above described issue, namely that of

diff --git a/neqo-transport/src/packet/mod.rs b/neqo-transport/src/packet/mod.rs
index 73b47bcc..779ca72b 100644
--- a/neqo-transport/src/packet/mod.rs
+++ b/neqo-transport/src/packet/mod.rs
@@ -563,7 +563,7 @@ pub struct PublicPacket<'a> {
     /// The packet type.
     packet_type: PacketType,
     /// The recovered destination connection ID.
-    dcid: ConnectionIdRef<'a>,
+    dcid: ConnectionId,
     /// The source connection ID, if this is a long header packet.
     scid: Option<ConnectionIdRef<'a>>,
     /// Any token that is included in the packet (Retry always has a token; Initial sometimes

That leaves us with another issue, namely rustc not being able to infer that early returns of
Okay, I got it. Let's take a look at

/// `PublicPacket` holds information from packets that is public only. This allows for
/// processing of packets prior to decryption.
pub struct PublicPacket<'a> {
    /// The packet type.
    packet_type: PacketType,
    /// The recovered destination connection ID.
    dcid: ConnectionIdRef<'a>,
    /// The source connection ID, if this is a long header packet.
    scid: Option<ConnectionIdRef<'a>>,
    /// Any token that is included in the packet (Retry always has a token; Initial sometimes
    /// does). This is empty when there is no token.
    token: &'a [u8],
    /// The size of the header, not including the packet number.
    header_len: usize,
    /// Protocol version, if present in header.
    version: Option<WireVersion>,
    /// A reference to the entire packet, including the header.
    data: &'a [u8],
}
This pull request introduces the following change:

@@ -564,7 +574,7 @@ pub struct PublicPacket<'a> {
     /// Protocol version, if present in header.
     version: Option<WireVersion>,
     /// A reference to the entire packet, including the header.
-    data: &'a [u8],
+    data: &'a mut [u8],
 }

While

An easy fix would be to make

diff --git a/neqo-transport/src/packet/mod.rs b/neqo-transport/src/packet/mod.rs
index 73b47bcc..dc85bbd0 100644
--- a/neqo-transport/src/packet/mod.rs
+++ b/neqo-transport/src/packet/mod.rs
@@ -563,12 +563,12 @@ pub struct PublicPacket<'a> {
     /// The packet type.
     packet_type: PacketType,
     /// The recovered destination connection ID.
-    dcid: ConnectionIdRef<'a>,
+    dcid: ConnectionId,
     /// The source connection ID, if this is a long header packet.
-    scid: Option<ConnectionIdRef<'a>>,
+    scid: Option<ConnectionId>,
     /// Any token that is included in the packet (Retry always has a token; Initial sometimes
     /// does). This is empty when there is no token.
-    token: &'a [u8],
+    token: Vec<u8>,
     /// The size of the header, not including the packet number.
     header_len: usize,
     /// Protocol version, if present in header.

The above, plus a couple of smaller lifetime changes, resolves the borrow checker issues. I will propose a commit with my local changes.
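A rough sketch of why owning the fields helps (hypothetical stand-in types, not the actual ConnectionId or PublicPacket definitions): once the connection ID and token are copied out, the decoded header no longer borrows from the buffer, so the payload can be handed back as the only live borrow.

// Hypothetical stand-in for an owned header.
struct OwnedHeader {
    dcid: Vec<u8>,  // stands in for ConnectionId
    token: Vec<u8>,
}

fn decode(buf: &mut [u8], dcid_len: usize) -> (OwnedHeader, &mut [u8]) {
    let header = OwnedHeader {
        dcid: buf[..dcid_len].to_vec(), // copy instead of borrow
        token: Vec::new(),
    };
    // Only one borrow of `buf` is live at this point, so this is fine.
    (header, &mut buf[dcid_len..])
}

fn main() {
    let mut packet = vec![1, 2, 3, 4, 5, 6, 7, 8];
    let (header, payload) = decode(&mut packet, 4);
    payload[0] ^= 0xff; // in-place work on the payload
    assert_eq!(header.dcid, [1, 2, 3, 4]);
}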
@larseggert let me know what you think of larseggert#34. Note that it only addresses the borrow-checker issues in

Happy to look at the
FWIW, I looked at the
39c0445 to 5c9bc3e
I actually think they might? Our API after the last round of changes is very similar. (Or I'm missing something.)
The challenge I see is twofold:
The latter is worse for us because we're passing in a mutable slice, which can't be resized in that way (we might do something with
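A hypothetical sketch of that constraint (the function name and tag size are made up for illustration): a &mut [u8] cannot grow, so anything that expands the data in place, such as appending an AEAD tag, has to fit inside the caller's buffer and report how many bytes were used.

const TAG_LEN: usize = 16; // illustrative AEAD tag size

fn seal_in_place(buf: &mut [u8], plaintext_len: usize) -> Result<usize, &'static str> {
    let needed = plaintext_len + TAG_LEN;
    if buf.len() < needed {
        // Unlike a Vec, the slice cannot grow to make room for the tag.
        return Err("buffer too small for expansion");
    }
    // Stand-in for real crypto: pretend the tag is written after the payload.
    for b in &mut buf[plaintext_len..needed] {
        *b = 0xaa;
    }
    Ok(needed)
}

fn main() {
    let mut buf = vec![0u8; 1200]; // caller leaves slack for the tag
    let used = seal_in_place(&mut buf, 100).expect("enough room");
    assert_eq!(used, 116);
}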
I have a few gotchas here, but I like how this is shaping up.
The real question I have is: does this really make it go faster? The benchmarks show some improvements. Are those consistent enough for you to be happy? I see that the improvements aren't 100% consistent.
Co-authored-by: Martin Thomson <[email protected]> Signed-off-by: Lars Eggert <[email protected]>
Signed-off-by: Lars Eggert <[email protected]>
I see a few percentage points (3-5%) locally. It's not a lot; I guess those extra heap allocations aren't that costly.
I'm kinda surprised the
It's quite possible that the work we do to generate a packet is still dominated by other factors. There is a non-significant improvement according to the runs; maybe try running it 10x more to get the noise down some more.
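One way to run it longer, sketched with criterion (this is an assumed setup, not the existing neqo bench harness): raise the sample count and measurement time for the noisy group so that a small but real difference becomes statistically visible.

use std::time::Duration;

use criterion::{criterion_group, criterion_main, Criterion};

fn transfer_benches(c: &mut Criterion) {
    let mut group = c.benchmark_group("transfer");
    group.sample_size(1000); // criterion's default is 100
    group.measurement_time(Duration::from_secs(120)); // default is 5 s per benchmark
    group.bench_function("pacing-false/varying-seeds", |b| {
        b.iter(|| {
            // Stand-in for the actual transfer being measured.
            std::hint::black_box(2u64 + 2);
        });
    });
    group.finish();
}

criterion_group!(benches, transfer_benches);
criterion_main!(benches);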
Fixes #2246 (eventually)