New config options for dogstatsd generation (#666)
* [does not build] refactor dogstatsd load generator

* Fixes remaining build issues, almost works now

* Incorrect use of choose_or_not

* Update comments and correct multi-value bound

* Satisfy clippy, make a `lading_rev` optimization target

This commit satisfies clippy in a handful of areas but most importantly adds a
`lading_rev` binary to the project. This binary will only be compiled when the
`dogstatsd_perf` feature flag is added to the build and is meant to be used for
optimizing the member generator.

Of note, we've started to discuss streaming built `Member` instances directly to
a generator without going through a block cache. To this point we've assumed
that the member generation only needs to be "fast enough", hence all the cloning
and small string allocations and the like. The generator is _slow_ as a
result. Consider that if you compile the project now with

```
> cargo build --release --features dogstatsd_perf --bin lading_rev
...
> hyperfine
Benchmark 1: ./target/release/lading_rev
  Time (mean ± σ):     32.688 s ±  3.813 s    [User: 28.507 s, System: 2.650 s]
  Range (min … max):   30.190 s … 40.988 s    10 runs
```

Call it roughly 30ms per `Member` instance, an eternity. I'm working on a Mac so
running

```
> cargo instruments --release --features dogstatsd_perf --bin lading_rev -t time
```

does appear to show that we spend _a lot_ of time making and cloning small
strings, which is accurate to my understanding of the code as it exists
today. `AsciiString` is a particular culprit. 67% of program runtime is in
`impl payload::Generator<Tagsets>` as of this commit.

Signed-off-by: Brian L. Troutwine <[email protected]>

* Shave some runtime

This commit removes the use of `choose_multiple` in the `AsciiString`
generator. Profiling shows that in a run of `lading_rev` this generator is
responsible for ~70% of program runtime, and within _that_, ~60% is
`choose_multiple`. While convenient, the function implicitly allocates a
little `Vec` which we do not really need here, since we immediately push the
results into our own storage anyway.

```
➜  lading git:(sopell/dogstatsd-generator-revamp) ✗ hyperfine --warmup 3 ./target/release/lading_rev
Benchmark 1: ./target/release/lading_rev
  Time (mean ± σ):     20.902 s ±  0.167 s    [User: 18.438 s, System: 2.405 s]
  Range (min … max):   20.727 s … 21.279 s    10 runs
```

Shaves ~10 seconds off.
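
The shape of that change can be sketched roughly as follows. This is not the `AsciiString` code from the commit: it is a std-only illustration in which `XorShift` is a hypothetical stand-in for the `rand` RNG, and `ALPHABET`, `ascii_via_vec` and `ascii_direct` are made-up names. Both functions sample with replacement, which `choose_multiple` does not, so the sketch only shows the allocation difference: the first variant fills a temporary `Vec` before copying into the output, the second pushes each sampled character straight into the destination `String`.

```rust
/// Hypothetical stand-in for a `rand` RNG: a tiny xorshift64 generator.
/// It exists only to keep this sketch free of external crates.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        // xorshift state must never be zero, so force the low bit on.
        XorShift(seed | 1)
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Index in `0..len`. Modulo bias is ignored for the sketch.
    pub fn index(&mut self, len: usize) -> usize {
        (self.next_u64() % len as u64) as usize
    }
}

/// Hypothetical character set to draw from.
const ALPHABET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789";

/// Samples into a temporary `Vec` first, then copies into the output.
pub fn ascii_via_vec(rng: &mut XorShift, len: usize) -> String {
    let picked: Vec<u8> = (0..len)
        .map(|_| ALPHABET[rng.index(ALPHABET.len())])
        .collect();
    let mut out = String::with_capacity(len);
    for b in picked {
        out.push(b as char);
    }
    out
}

/// Pushes each sampled byte straight into the output, with no temporary `Vec`.
pub fn ascii_direct(rng: &mut XorShift, len: usize) -> String {
    let mut out = String::with_capacity(len);
    for _ in 0..len {
        out.push(ALPHABET[rng.index(ALPHABET.len())] as char);
    }
    out
}
```

With the same seed, both functions consume the generator identically and return the same string; the second simply skips one heap allocation per call, which adds up when the function is called once per generated name or tag.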

Signed-off-by: Brian L. Troutwine <[email protected]>

* tidy up after rebase

Signed-off-by: Brian L. Troutwine <[email protected]>

* re-add lading_rev, correct defaults

Signed-off-by: Brian L. Troutwine <[email protected]>

* Adds config knobs for min and max name length

* Adds config knobs for tag key and tag value string length

* Fix clippy useless conversion warning

* Updates readme and dogstatsd example

* Suppresses clippy warnings about too-many-arguments

* Removes stale comment and debug build directive

---------

Signed-off-by: Brian L. Troutwine <[email protected]>
Co-authored-by: Brian L. Troutwine <[email protected]>
scottopell and blt authored Aug 15, 2023
1 parent 6249fd5 commit 3a68e2e
Showing 16 changed files with 717 additions and 937 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -12,6 +12,11 @@ component of a larger performance testing strategy for complex programs. The
 [installation instructions](https://grpc.io/docs/protoc-installation/) from the
 protobuf docs.

+For criterion benchmarks, you can run them via `cargo bench`.
+[`cargo-criterion`](https://github.com/bheisler/cargo-criterion)
+is a more advanced cargo extension that provides
+historical (ie baseline) tracking functionality.
+
 ## Operating Model

 `lading` operates on three conceptual components:
33 changes: 33 additions & 0 deletions examples/dogstatsd-generation.yaml
@@ -0,0 +1,33 @@
+generator:
+  - unix_datagram:
+      seed: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53,
+             59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131]
+      path: "/tmp/dsd.sock"
+      variant:
+        dogstatsd:
+          contexts_minimum: 1000
+          contexts_maximum: 8000
+          tags_per_msg_minimum: 50
+          tags_per_msg_maximum: 71
+          multivalue_pack_probability: 1108
+          multivalue_count_minimum: 2
+          multivalue_count_maximum: 40
+          kind_weights:
+            metric: 80
+            event: 10
+            service_check: 10
+          metric_weights:
+            count: 100
+            gauge: 100
+            timer: 20
+            distribution: 100
+            set: 20
+            histogram: 20
+      bytes_per_second: "150 Mb"
+      parallel_connections: 1
+      block_sizes: ["1Kb", "2Kb", "3Kb"]
+      maximum_prebuild_cache_size_bytes: "50 Mb"
+
+blackhole:
+  - http:
+      binding_addr: "0.0.0.0:8089"
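
One way to read the `kind_weights` block in the example: assuming the weights are relative (an assumption, this diff does not show how they are consumed), 80/10/10 would mean metrics are picked about 80% of the time and events and service checks about 10% each. A minimal sketch of cumulative-weight selection follows. `Kind`, `KindWeights` and `pick` are hypothetical names for illustration, not lading's types, and the caller is expected to supply a random roll in `0..total()`.

```rust
/// Hypothetical message kinds, mirroring the keys under `kind_weights`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Metric,
    Event,
    ServiceCheck,
}

/// Hypothetical mirror of the `kind_weights` config block.
pub struct KindWeights {
    pub metric: u32,
    pub event: u32,
    pub service_check: u32,
}

impl KindWeights {
    /// Sum of all weights; a roll must be strictly below this.
    pub fn total(&self) -> u32 {
        self.metric + self.event + self.service_check
    }

    /// Maps `roll` (expected in `0..total()`) to a kind by walking the
    /// cumulative weights. Returns `None` for an out-of-range roll.
    pub fn pick(&self, roll: u32) -> Option<Kind> {
        if roll >= self.total() {
            return None;
        }
        if roll < self.metric {
            Some(Kind::Metric)
        } else if roll < self.metric + self.event {
            Some(Kind::Event)
        } else {
            Some(Kind::ServiceCheck)
        }
    }
}
```

Under this reading only the ratios matter, so weights of 8/1/1 would describe the same distribution as 80/10/10.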
123 changes: 62 additions & 61 deletions lading/src/block.rs
@@ -1,11 +1,10 @@
 use std::num::{NonZeroU32, NonZeroUsize};

 use bytes::{buf::Writer, BufMut, Bytes, BytesMut};
+use lading_payload as payload;
 use metrics::gauge;
 use rand::{prelude::SliceRandom, Rng};

-use lading_payload::{Serialize, TraceAgent};
-
 #[derive(Debug, PartialEq, Eq, Clone, Copy, thiserror::Error)]
 pub enum Error {
     #[error("Chunk error: {0}")]
@@ -101,106 +100,108 @@ where

 pub(crate) fn construct_block_cache<R>(
     mut rng: R,
-    payload: &lading_payload::Config,
+    payload: &payload::Config,
     block_chunks: &[usize],
     labels: &Vec<(String, String)>,
 ) -> Vec<Block>
 where
     R: Rng,
 {
     match payload {
-        lading_payload::Config::TraceAgent(enc) => {
+        payload::Config::TraceAgent(enc) => {
             let ta = match enc {
-                lading_payload::Encoding::Json => TraceAgent::json(),
-                lading_payload::Encoding::MsgPack => TraceAgent::msg_pack(),
+                payload::Encoding::Json => payload::TraceAgent::json(),
+                payload::Encoding::MsgPack => payload::TraceAgent::msg_pack(),
             };

             construct_block_cache_inner(&mut rng, &ta, block_chunks, labels)
         }
-        lading_payload::Config::Syslog5424 => construct_block_cache_inner(
+        payload::Config::Syslog5424 => construct_block_cache_inner(
             &mut rng,
-            &lading_payload::Syslog5424::default(),
+            &payload::Syslog5424::default(),
             block_chunks,
             labels,
         ),
-        lading_payload::Config::DogStatsD(lading_payload::dogstatsd::Config {
-            metric_names_minimum,
-            metric_names_maximum,
-            tag_keys_minimum,
-            tag_keys_maximum,
+        payload::Config::DogStatsD(payload::dogstatsd::Config {
+            contexts_minimum,
+            contexts_maximum,
+            name_length_minimum,
+            name_length_maximum,
+            tag_key_length_minimum,
+            tag_key_length_maximum,
+            tag_value_length_minimum,
+            tag_value_length_maximum,
+            tags_per_msg_minimum,
+            tags_per_msg_maximum,
+            // TODO -- how can I validate user input for multivalue_pack_probability
+            multivalue_pack_probability,
+            multivalue_count_minimum,
+            multivalue_count_maximum,
             kind_weights,
             metric_weights,
-            metric_multivalue,
         }) => {
-            let mn_range = *metric_names_minimum..*metric_names_maximum;
-            let tg_range = *tag_keys_minimum..*tag_keys_maximum;
+            let context_range = *contexts_minimum..*contexts_maximum;
+            let tags_per_msg_range = *tags_per_msg_minimum..*tags_per_msg_maximum;
+            let name_length_range = *name_length_minimum..*name_length_maximum;
+            let tag_key_length_range = *tag_key_length_minimum..*tag_key_length_maximum;
+            let tag_value_length_range = *tag_value_length_minimum..*tag_value_length_maximum;
+            let multivalue_count_range = *multivalue_count_minimum..*multivalue_count_maximum;

-            let serializer = lading_payload::DogStatsD::new(
-                mn_range,
-                tg_range,
+            let serializer = payload::DogStatsD::new(
+                context_range,
+                name_length_range,
+                tag_key_length_range,
+                tag_value_length_range,
+                tags_per_msg_range,
+                multivalue_count_range,
+                *multivalue_pack_probability,
                 *kind_weights,
                 *metric_weights,
-                metric_multivalue,
                 &mut rng,
             );

             construct_block_cache_inner(&mut rng, &serializer, block_chunks, labels)
         }
-        lading_payload::Config::Fluent => construct_block_cache_inner(
-            &mut rng,
-            &lading_payload::Fluent::default(),
-            block_chunks,
-            labels,
-        ),
-        lading_payload::Config::SplunkHec { encoding } => construct_block_cache_inner(
-            &mut rng,
-            &lading_payload::SplunkHec::new(*encoding),
-            block_chunks,
-            labels,
-        ),
-        lading_payload::Config::ApacheCommon => construct_block_cache_inner(
+        payload::Config::Fluent => {
+            construct_block_cache_inner(&mut rng, &payload::Fluent::default(), block_chunks, labels)
+        }
+        payload::Config::SplunkHec { encoding } => construct_block_cache_inner(
             &mut rng,
-            &lading_payload::ApacheCommon::default(),
+            &payload::SplunkHec::new(*encoding),
             block_chunks,
             labels,
         ),
-        lading_payload::Config::Ascii => construct_block_cache_inner(
+        payload::Config::ApacheCommon => construct_block_cache_inner(
             &mut rng,
-            &lading_payload::Ascii::default(),
+            &payload::ApacheCommon::default(),
             block_chunks,
             labels,
         ),
-        lading_payload::Config::DatadogLog => {
-            let serializer = lading_payload::DatadogLog::new(&mut rng);
+        payload::Config::Ascii => {
+            construct_block_cache_inner(&mut rng, &payload::Ascii, block_chunks, labels)
+        }
+        payload::Config::DatadogLog => {
+            let serializer = payload::DatadogLog::new(&mut rng);
             construct_block_cache_inner(&mut rng, &serializer, block_chunks, labels)
         }
-        lading_payload::Config::Json => {
-            construct_block_cache_inner(&mut rng, &lading_payload::Json, block_chunks, labels)
+        payload::Config::Json => {
+            construct_block_cache_inner(&mut rng, &payload::Json, block_chunks, labels)
         }
-        lading_payload::Config::Static { ref static_path } => construct_block_cache_inner(
+        payload::Config::Static { ref static_path } => construct_block_cache_inner(
             &mut rng,
-            &lading_payload::Static::new(static_path),
-            block_chunks,
-            labels,
-        ),
-        lading_payload::Config::OpentelemetryTraces => construct_block_cache_inner(
-            rng,
-            &lading_payload::OpentelemetryTraces,
-            block_chunks,
-            labels,
-        ),
-        lading_payload::Config::OpentelemetryLogs => construct_block_cache_inner(
-            rng,
-            &lading_payload::OpentelemetryLogs,
-            block_chunks,
-            labels,
-        ),
-        lading_payload::Config::OpentelemetryMetrics => construct_block_cache_inner(
-            rng,
-            &lading_payload::OpentelemetryMetrics,
+            &payload::Static::new(static_path),
             block_chunks,
             labels,
         ),
+        payload::Config::OpentelemetryTraces => {
+            construct_block_cache_inner(rng, &payload::OpentelemetryTraces, block_chunks, labels)
+        }
+        payload::Config::OpentelemetryLogs => {
+            construct_block_cache_inner(rng, &payload::OpentelemetryLogs, block_chunks, labels)
+        }
+        payload::Config::OpentelemetryMetrics => {
+            construct_block_cache_inner(rng, &payload::OpentelemetryMetrics, block_chunks, labels)
+        }
     }
 }

@@ -226,7 +227,7 @@ fn construct_block_cache_inner<R, S>(
     labels: &Vec<(String, String)>,
 ) -> Vec<Block>
 where
-    S: Serialize,
+    S: payload::Serialize,
     R: Rng,
 {
     let mut block_cache: Vec<Block> = Vec::with_capacity(block_chunks.len());
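
Two details of the DogStatsD arm in this file may be worth a sketch. First, each `*_minimum..*_maximum` expression builds a half-open `Range`, so the configured maximum itself is never produced. Second, the TODO asks how `multivalue_pack_probability` could be validated. One option is a plain bounds check, assuming the value is meant to be a probability in `[0.0, 1.0]` stored as an `f32`; the diff does not show its type, so that is an assumption. The function names `bounds` and `validate_probability` are hypothetical and do not appear in lading.

```rust
use std::ops::Range;

/// Builds the half-open range `min..max`, rejecting empty or inverted bounds.
/// The upper bound is exclusive, matching the `a..b` ranges in the diff.
pub fn bounds(min: usize, max: usize) -> Result<Range<usize>, String> {
    if min < max {
        Ok(min..max)
    } else {
        Err(format!("minimum {} must be less than maximum {}", min, max))
    }
}

/// Accepts only finite values in `[0.0, 1.0]`.
/// Assumption: the configured value is a probability stored as `f32`.
pub fn validate_probability(p: f32) -> Result<f32, String> {
    if p.is_finite() && (0.0..=1.0).contains(&p) {
        Ok(p)
    } else {
        Err(format!("probability {} is outside [0.0, 1.0]", p))
    }
}
```

Running checks like these once at config load time would surface a bad value as an error message up front, before any block cache is built.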
59 changes: 0 additions & 59 deletions lading/src/payload/ascii.rs

This file was deleted.

40 changes: 0 additions & 40 deletions lading/src/payload/common.rs

This file was deleted.
