Optimise segment sparse reads for external readers. #486

Open
kjnilsson opened this issue Dec 6, 2024 · 0 comments · Fixed by #487
@kjnilsson
Contributor

Is your feature request related to a problem? Please describe.

Currently, when a segment is opened for reading, the entire index is rebuilt and kept in memory. For the default segment size of 4096 entries this index takes up ~42KB of memory. In addition, external readers should rebuild the index for segments that aren't full every time they are asked to read, to avoid accidentally reading an entry from a prior term that has since been overwritten (all entries are appended to segments, no actual overwrites occur, and resolving the correct entry to read is done at index parse time).

Hence it would make sense to profile and optimise segment open, index parsing and index size.
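To illustrate why a non-full segment has to be re-parsed, here is a minimal sketch of parse-time resolution. It is written in Python purely as an illustration, not as the Erlang implementation. The record layout `(ra_index, term, offset, length)` and the "last appended record wins" rule are assumptions; the description above only says that the correct entry is resolved at index parse time. The sketch also ignores any invalidation of higher indexes after a rewrite.

```python
# Hypothetical index records in append order: (ra_index, term, offset, length).
# Assumption: when the same index appears more than once, the record
# appended last is the one that should be read.
def build_index(records):
    index = {}
    for ra_index, term, offset, length in records:
        # A later record for the same index replaces the earlier one in the
        # in-memory index; the data on disk is never overwritten.
        index[ra_index] = (term, offset, length)
    return index


records = [
    (1, 1, 0, 10),
    (2, 1, 10, 12),
    (2, 2, 22, 8),  # index 2 written again in term 2, appended to the segment
]
index = build_index(records)
```

A reader that kept the index built before the third record was appended would still resolve index 2 to the term 1 entry, which is the stale read the rebuild is meant to prevent.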

Describe the solution you'd like

A couple of options:

Instead of a map of #{ra_index() => {ra_term(), Offset, Length, Crc}} we could compact the index for external readers to something like:

[[{ra_index(), Offset, Length}...]]

where the inner rows have a fixed maximum length, e.g. 64 entries. For a 4096 entry segment this gives at most 64 rows, so a read would require a maximum of 64+64 operations, which is better than scanning a flat list and more space efficient (~24KB) than a map.
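A rough sketch of the two-level lookup, in Python for illustration only (the real structure would be an Erlang list of lists). `build_rows` and `lookup` are hypothetical names, and the sketch assumes the entries are sorted by index so that the first and last entry of a row bound its range.

```python
ROW_SIZE = 64  # max entries per inner row, as in the example above


def build_rows(entries, row_size=ROW_SIZE):
    """Split a list of (ra_index, offset, length), sorted by ra_index, into rows."""
    return [entries[i:i + row_size] for i in range(0, len(entries), row_size)]


def lookup(rows, ra_index):
    """Return (offset, length) for ra_index, or None if it is not present."""
    for row in rows:  # at most 64 rows for a 4096 entry segment
        if row[0][0] <= ra_index <= row[-1][0]:
            for idx, offset, length in row:  # at most 64 entries per row
                if idx == ra_index:
                    return offset, length
            return None
    return None


entries = [(i, i * 100, 100) for i in range(1, 4097)]
rows = build_rows(entries)
```

The outer scan only looks at the first and last entry of each row, so the worst case is one pass over the rows plus one pass over a single row.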

Another option:

Parse the binary index on demand. This could result in too many file syscalls, but it may be a good option for segments that aren't full, as these need to be re-parsed on every read request anyway.
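A sketch of what on-demand parsing could look like, again in Python for illustration. The fixed-width record layout (`index`, `term`, `offset`, `length`, `crc`) and the `find_entry` helper are hypothetical and are not the actual segment file format. To keep the syscall count down, the sketch reads the whole index region in one call and scans it without building a map.

```python
import io
import struct

# Hypothetical record layout: index (u64), term (u64), offset (u32),
# length (u32), crc (u32), all big-endian. Not the real segment format.
RECORD = struct.Struct(">QQIII")


def find_entry(f, index_start, record_count, ra_index):
    """Return (term, offset, length, crc) for ra_index, or None."""
    f.seek(index_start)
    raw = f.read(record_count * RECORD.size)  # one read for the index region
    complete = len(raw) // RECORD.size        # ignore a trailing partial record
    # Scan newest-first so that a later record for the same index wins
    # (assumption: the last appended record is the valid one).
    for i in range(complete - 1, -1, -1):
        idx, term, offset, length, crc = RECORD.unpack_from(raw, i * RECORD.size)
        if idx == ra_index:
            return term, offset, length, crc
    return None


# In-memory stand-in for a segment file containing only index records.
segment = io.BytesIO(b"".join(RECORD.pack(*r) for r in [
    (1, 1, 0, 10, 0),
    (2, 1, 10, 12, 0),
    (2, 2, 22, 8, 0),  # index 2 written again in term 2
]))
found = find_entry(segment, 0, 3, 2)
missing = find_entry(segment, 0, 3, 9)
```

Whether a single larger read plus a linear scan beats rebuilding the map would need to be measured as part of the profiling suggested above.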

Describe alternatives you've considered

.

Additional context

.

@kjnilsson kjnilsson added this to the 2.16.0 milestone Dec 6, 2024
@kjnilsson kjnilsson reopened this Dec 12, 2024