feat: avoid oom snapshot #26043
base: main
Conversation
Force-pushed from 56024af to 2213e14
@@ -181,7 +181,7 @@ pub struct Config {
     #[clap(
         long = "gen1-duration",
         env = "INFLUXDB3_GEN1_DURATION",
-        default_value = "10m",
+        default_value = "1m",
Defaulting to 1m means there are 10x more query chunks in `QueryableBuffer` than with 10m, but this hasn't been an issue so far.
-        for chunk in snapshot_chunks {
+        for chunk in snapshot_chunks_iter {
This `snapshot_chunks_iter` produces each `SnapshotChunk` lazily, uses the chunk to create a `PersistJob`, and then moves it into the `TableBuffer`'s `snapshotting_chunks`. Because there's a write lock on this buffer above, it is OK to remove the key and then add it back. Previously `snapshotting_chunks` was cloned; this avoids the cloning.
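For readers outside the PR, here is a minimal sketch of that pattern, with the iterator inlined as a loop that drains one chunk time at a time; `SnapshotChunk`, `PersistJob`, and `TableBuffer` below are simplified stand-ins, not the real influxdb3 types:

```rust
use std::collections::BTreeMap;

struct SnapshotChunk {
    chunk_time: i64,
    rows: Vec<String>, // placeholder for the real record batches
}

struct PersistJob {
    chunk_time: i64,
    path: String,
}

struct TableBuffer {
    chunks: BTreeMap<i64, Vec<String>>,
    snapshotting_chunks: Vec<SnapshotChunk>,
}

/// The caller holds the buffer's write lock, so removing a key and re-adding
/// the chunk to `snapshotting_chunks` is safe; nothing is cloned on the way.
fn make_persist_jobs(buffer: &mut TableBuffer, times: Vec<i64>) -> Vec<PersistJob> {
    let mut jobs = Vec::with_capacity(times.len());
    for t in times {
        // lazily materialise one SnapshotChunk (no up-front clone of the map)
        let Some(rows) = buffer.chunks.remove(&t) else { continue };
        let chunk = SnapshotChunk { chunk_time: t, rows };
        jobs.push(PersistJob { chunk_time: t, path: format!("table/{t}.parquet") });
        // move (not clone) the chunk so queries still see it while it persists
        buffer.snapshotting_chunks.push(chunk);
    }
    jobs
}

fn main() {
    let mut buf = TableBuffer {
        chunks: BTreeMap::from([(0, vec!["a".into()]), (60, vec!["b".into()])]),
        snapshotting_chunks: Vec::new(),
    };
    let jobs = make_persist_jobs(&mut buf, vec![0, 60]);
    assert_eq!(jobs.len(), 2);
    assert_eq!(buf.snapshotting_chunks.len(), 2);
}
```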
    persisted_files: Arc<PersistedFiles>,
    persisted_snapshot: Arc<Mutex<PersistedSnapshot>>,
) {
    let iterator = PersistJobGroupedIterator::new(
This allows chunks to be grouped: with a 1m gen 1 duration, it aggregates up to 10 chunks together and writes a single parquet file covering a 10m window.
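For context, a hand-rolled sketch of the grouping idea; the actual `PersistJobGroupedIterator` in this PR has a different signature, but the shape is an iterator adapter that pulls up to `group_size` items from the inner iterator:

```rust
// Groups items from an inner iterator into batches of at most `group_size`,
// so ten 1m persist jobs can become one parquet file for a 10m window.
struct GroupedIterator<I: Iterator> {
    inner: I,
    group_size: usize,
}

impl<I: Iterator> Iterator for GroupedIterator<I> {
    type Item = Vec<I::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        let group: Vec<I::Item> = self.inner.by_ref().take(self.group_size).collect();
        if group.is_empty() { None } else { Some(group) }
    }
}

fn main() {
    // 25 one-minute chunk times collapse into groups of 10, 10, and 5
    let jobs = (0..25).map(|i| i * 60); // chunk times in seconds
    let grouped = GroupedIterator { inner: jobs, group_size: 10 };
    for group in grouped {
        println!("one parquet file for chunk times {:?}", group);
    }
}
```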
Force-pushed from 3a4a9ab to d85460b
}

#[test_log::test(tokio::test)]
async fn test_snapshot_serially_two_tables_with_varying_throughput() {
@pauldix - this should cover the case we discussed, with 2 tables receiving different amounts of writes.
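As a rough outline of the scenario (the `WriteBuffer` stub below is hypothetical, not the influxdb3 type or this PR's actual test harness):

```rust
use std::collections::HashMap;

#[derive(Default)]
struct WriteBuffer {
    writes: HashMap<&'static str, Vec<String>>,
}

impl WriteBuffer {
    fn write(&mut self, table: &'static str, line: String) {
        self.writes.entry(table).or_default().push(line);
    }

    fn force_snapshot(&mut self) -> usize {
        // stand-in: a real snapshot would sort/dedupe each table's chunks
        // serially and persist them
        self.writes.values().map(Vec::len).sum()
    }
}

#[test]
fn two_tables_varying_throughput_sketch() {
    let mut buffer = WriteBuffer::default();
    // table_a receives 10x the writes of table_b in the same window, so the
    // two tables reach the snapshot with very different chunk sizes
    for i in 0..100 {
        buffer.write("table_a", format!("table_a val={i}"));
        if i % 10 == 0 {
            buffer.write("table_b", format!("table_b val={i}"));
        }
    }
    assert_eq!(buffer.force_snapshot(), 110);
}
```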
Force-pushed from 9aa8fdf to 0760147
This PR addresses the OOM issue (or reduces the chances of running into OOM when snapshotting) by making the following main changes:
- defaults gen 1 duration to 1m (instead of 10m)
- snapshot chunks are built lazily
- the sort/dedupe step itself is done serially (i.e. 1 at a time)

As an optimisation, when _not_ forcing a snapshot it aggregates up to 10m worth of chunks and writes them in parallel; the assumption is that, given it's a normal snapshot, there is enough memory to run it.

closes: #25991
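For illustration, a minimal sketch of the "sort/dedupe serially, write in parallel" shape, assuming tokio and the futures crate; `sort_dedupe` and `write_parquet` here are placeholders, not influxdb3 APIs:

```rust
use futures::future::join_all;

async fn sort_dedupe(mut chunk: Vec<u64>) -> Vec<u64> {
    // the memory-heavy step: run one chunk at a time to cap peak usage
    chunk.sort_unstable();
    chunk.dedup();
    chunk
}

async fn write_parquet(chunk: Vec<u64>) {
    // stand-in for the real object-store write; I/O bound, safe to parallelise
    println!("wrote {} rows", chunk.len());
}

async fn persist_snapshot(chunks: Vec<Vec<u64>>) {
    let mut prepared = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        // serial: only one chunk's sort/dedupe is in memory at once
        prepared.push(sort_dedupe(chunk).await);
    }
    // parallel: all writes run concurrently
    join_all(prepared.into_iter().map(write_parquet)).await;
}

#[tokio::main]
async fn main() {
    persist_snapshot(vec![vec![3, 1, 1, 2], vec![5, 5, 4]]).await;
}
```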
- extra debug logs added
- test fixes
Force-pushed from 0760147 to 19c29ab