Is your feature request related to a problem? Please describe.
Currently, when a segment is opened for reading, the entire index is rebuilt and kept in memory. For the default segment size of 4096 entries this index takes up ~42 KB of memory. In addition, external readers have to rebuild the index on every read request for segments that aren't yet full, to avoid accidentally reading an entry that was overwritten in a prior term (entries are only ever appended to segments, so no in-place overwrites occur, and resolving the correct entry to read is done at index parse time).
It would therefore make sense to profile/measure and then optimise segment open, index parsing, and index size.
Describe the solution you'd like
A couple of options:
Instead of a map of #{ra_index() => {ra_term(), Offset, Length, Crc}} we could compact the index for external readers into something like:
[[{ra_index(), Offset, Length}, ...]]
where the inner rows have a fixed maximum length, e.g. 64 entries. A read would then require a maximum of 64 + 64 operations, which is worse than keeping a flat list for lookups but more space efficient (~24 KB) than a map.
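A minimal sketch of how a lookup against such a tiered index could work, assuming rows are sorted by index and each outer element is a row of at most 64 entries (module and function names here are hypothetical, not part of ra):

```erlang
%% tiered_index_sketch: illustrative lookup over a
%% [[{ra_index(), Offset, Length}, ...]] tiered index.
-module(tiered_index_sketch).
-export([lookup/2]).

%% Walk the outer list of rows until the next row starts past Idx,
%% then scan within the selected row. With 64-entry rows this is at
%% most 64 row hops plus 64 in-row comparisons.
lookup(Idx, [Row | Rest]) ->
    case Rest of
        [[{NextFirst, _, _} | _] | _] when Idx >= NextFirst ->
            lookup(Idx, Rest);
        _ ->
            find_in_row(Idx, Row)
    end;
lookup(_Idx, []) ->
    undefined.

find_in_row(Idx, [{Idx, Offset, Length} | _]) ->
    {Offset, Length};
find_in_row(Idx, [_ | Rest]) ->
    find_in_row(Idx, Rest);
find_in_row(_Idx, []) ->
    undefined.
```

The space saving comes from dropping the term and CRC per entry and avoiding per-key map overhead; the cost is a bounded linear scan instead of a hash lookup.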
Another option:
Parse the binary index on demand. This could result in too many file syscalls, but it may be a good option for segments that aren't yet full, as these need to be re-parsed on every read request anyway.
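An on-demand lookup could pread fixed-size index records from the segment file instead of materialising the whole index. The record layout below (28 bytes: index, term, offset, length, CRC) is an assumption made for illustration only and is not ra's actual on-disk format; the scan keeps the last record seen for a given index, mirroring the append-only "later term supersedes earlier" resolution described above:

```erlang
%% ondemand_index_sketch: illustrative on-demand index scan.
%% Assumed (hypothetical) record layout:
%%   <<Idx:64, Term:64, Offset:32, Length:32, Crc:32>> = 28 bytes.
-module(ondemand_index_sketch).
-export([find_entry/3]).

-define(RECORD_SIZE, 28).

%% Read index records one pread at a time starting at Pos and return
%% {Term, Offset, Length} for the *last* record matching Idx, since a
%% later append in a higher term supersedes an earlier one.
find_entry(Fd, Pos, Idx) ->
    find_entry(Fd, Pos, Idx, undefined).

find_entry(Fd, Pos, Idx, Acc) ->
    case file:pread(Fd, Pos, ?RECORD_SIZE) of
        {ok, <<Idx:64/unsigned, Term:64/unsigned,
               Offset:32/unsigned, Length:32/unsigned,
               _Crc:32/unsigned>>} ->
            %% Remember this record and keep scanning for later ones.
            find_entry(Fd, Pos + ?RECORD_SIZE, Idx,
                       {Term, Offset, Length});
        {ok, _OtherRecord} ->
            find_entry(Fd, Pos + ?RECORD_SIZE, Idx, Acc);
        eof ->
            Acc
    end.
```

Each record costs one pread syscall here; in practice reads would likely be batched into larger chunks to keep the syscall count acceptable.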
Describe alternatives you've considered
.
Additional context
.