You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, the segment size managed by an allocator is fixed at 1 GiB. This issue proposes 2 ways to allow dynamic segment sizing, configurable at runtime. The goal is to support allocators managing entire disks/vdevs, while still allowing the current 1 GiB segment or even smaller sizes.
Problem
The fixed segment size and implementation limits flexibility in allocator design. For larger vdevs, managing allocation bitmaps for the entire device could be more efficient than handling many smaller segments, especially if/when more powerful allocators, than the current SegmentAllocator are used. However, it's worth noting that for very large vdevs (e.g., 10 TiB), the allocation bitmap itself can become quite large (320 MiB in this case), posing challenges for reading and writing during each generation update. This could also become problematic for even moderate vdev sizes like 1 TiB (32 MiB bitmap). Smaller segment sizes might also be desirable in other situations for finer-grained control.
Proposed Changes (Bitmap-based)
This approach retains the bitmap-based allocators but makes the segment size configurable.
Configuration: Introduce a new parameter in DatabaseConfiguration to specify the allocator segment size. This parameter could accept values like "128", "1024", "8192" to define the size of the segment in Bytes or KiB/MiB. A value of "full" could signal that the allocator manages the entire vdev.
The parameter could be specify the size for the whole storage configuration or just individual disks or layers.
pubtraitAllocator:Send + Sync{fnnew(bitmap:&[u8],segment_size:Block<u64>) -> SelfwhereSelf:Sized;// ... other methods}
SegmentId Handling: Remove the implicit tie between SegmentId and disk offset to reflect that a segment may now be a disk. Generalize SegmentId to reflect this.
Bitmap Storage: The way bitmaps are stored probably doesn't have to change at all because SegmentIds will still be unique. That means that the root tree can properly index them. It has to be investigated how the root tree handles large bitmaps larger than the maximum leaf size.
Data Management Layer (DML) Changes:
Update DML to handle the new configuration parameter and pass the segment size to the allocator.
Modify the allocation and deallocation methods in DML to handle both full vdev allocation and segmented allocation.
Handler Changes: The allocators HashMap can probably stay the same for similar reasons as the bitmap storage.
Challenges
Error Handling: Implement error handling for invalid segment sizes or configurations.
Testing: Develop tests to cover different segment sizes and configurations.
Proposed Changes (Serialization-based)
This approach moves away from the fixed bitmap representation, allowing allocators to manage their state using arbitrary, serializable data structures.
Configuration: Same as in the bitmap-based approach. Introduce a new parameter in DatabaseConfiguration to specify the allocator segment size, supporting values like "128", "1024", "8192" (for Bytes or KiB/MiB) and "full".
The new function now takes a segment_size parameter. allocate, allocate_at, and deallocate operate on block offsets within a segment. serialize and deserialize manage the state safed and loaded to/from disk.
SegmentId Handling: Same as in the bitmap-based approach. Generalize SegmentId to support both traditional segments and entire disks.
Allocator State Storage: Instead of storing bitmaps directly, the root tree will store the serialized allocator state. The SegmentId will continue to be used as the key. The Handler will need to serialize and deserialize allocator states.
Data Management Layer (DML) Changes:
Update DML to handle the new configuration parameter and pass the segment size to the allocator during initialization.
Modify the allocation and deallocation methods to interact with the Allocator trait's new methods.
Handler Changes:
The Handler will now need to serialize the allocator's state when storing it in the root tree and deserialize it when retrieving.
The allocators HashMap can probably stay the same.
Delayed (De-)allocation Handling: This would be the main work in this approach.
Because the internal state of allocators could be complex the current approach of directly modifying the bitmaps without the allocator knowing won't work.
This means that many parts of the allocation and deallocation logic have to be rewritten to be agnostic of the allocator internals.
It's hard to estimate what else would have to be done because the allocation logic is so deeply intertwined with the snapshots, syncing and stretches many parts of the betree crate.
Advantages of Serialization-based Approach
Flexibility: Allocators are no longer constrained to a bitmap representation, allowing for more sophisticated allocation strategies.
Extensibility: Easier to add new allocator types with different state representations.
Potential for Reduced Memory Usage: Some allocators might require less memory to track free space compared to a full bitmap.
Disadvantages of Serialization-based Approach
Implementation Time: Because large parts of the code have to be changed this would require a lot of rewriting and testing.
Complexity: Increased complexity due to serialization and deserialization. But maybe also decreased complexity because allocation bitmaps (or other new structures) aren't changed in different parts of the codebase.
Serialization Overhead: Serializing and deserializing allocator state might introduce overhead compared to the simple bitmap approach. But for allocators using just a bitmap it should be about the same.
Testing: Testing will be needed to ensure the correctness of the serialization and deserialization logic for each allocator.
Conclusion
Both approaches offer viable paths to dynamic segment sizing. The bitmap-based approach is simpler to implement but less flexible. The serialization-based approach offers greater flexibility and potential optimizations but is harder to realize. The choice between the two depends on the project's priorities and the expected evolution of the allocator implementations.
The text was updated successfully, but these errors were encountered:
Currently, the segment size managed by an allocator is fixed at 1 GiB. This issue proposes 2 ways to allow dynamic segment sizing, configurable at runtime. The goal is to support allocators managing entire disks/vdevs, while still allowing the current 1 GiB segment or even smaller sizes.
Problem
The fixed segment size and implementation limits flexibility in allocator design. For larger vdevs, managing allocation bitmaps for the entire device could be more efficient than handling many smaller segments, especially if/when more powerful allocators, than the current
SegmentAllocator
are used. However, it's worth noting that for very large vdevs (e.g., 10 TiB), the allocation bitmap itself can become quite large (320 MiB in this case), posing challenges for reading and writing during each generation update. This could also become problematic for even moderate vdev sizes like 1 TiB (32 MiB bitmap). Smaller segment sizes might also be desirable in other situations for finer-grained control.Proposed Changes (Bitmap-based)
This approach retains the bitmap-based allocators but makes the segment size configurable.
Configuration: Introduce a new parameter in
DatabaseConfiguration
to specify the allocator segment size. This parameter could accept values like "128", "1024", "8192" to define the size of the segment in Bytes or KiB/MiB. A value of "full" could signal that the allocator manages the entire vdev.The parameter could be specify the size for the whole storage configuration or just individual disks or layers.
Allocator Interface: Modify the
Allocator
trait created in feat: Implement allocator trait with example implementations #66 to accept the segment size during initialization.SegmentId Handling: Remove the implicit tie between SegmentId and disk offset to reflect that a segment may now be a disk. Generalize
SegmentId
to reflect this.Bitmap Storage: The way bitmaps are stored probably doesn't have to change at all because
SegmentId
s will still be unique. That means that the root tree can properly index them. It has to be investigated how the root tree handles large bitmaps larger than the maximum leaf size.Data Management Layer (DML) Changes:
Handler
Changes: Theallocators
HashMap can probably stay the same for similar reasons as the bitmap storage.Challenges
Proposed Changes (Serialization-based)
This approach moves away from the fixed bitmap representation, allowing allocators to manage their state using arbitrary, serializable data structures.
Configuration: Same as in the bitmap-based approach. Introduce a new parameter in
DatabaseConfiguration
to specify the allocator segment size, supporting values like "128", "1024", "8192" (for Bytes or KiB/MiB) and "full".Allocator Interface: Modify the
Allocator
trait proposed in feat: Implement allocator trait with example implementations #66 to use serialization for state management.The
new
function now takes asegment_size
parameter.allocate
,allocate_at
, anddeallocate
operate on block offsets within a segment.serialize
anddeserialize
manage the state safed and loaded to/from disk.SegmentId Handling: Same as in the bitmap-based approach. Generalize
SegmentId
to support both traditional segments and entire disks.Allocator State Storage: Instead of storing bitmaps directly, the root tree will store the serialized allocator state. The
SegmentId
will continue to be used as the key. TheHandler
will need to serialize and deserialize allocator states.Data Management Layer (DML) Changes:
Allocator
trait's new methods.Handler
Changes:Handler
will now need to serialize the allocator's state when storing it in the root tree and deserialize it when retrieving.allocators
HashMap can probably stay the same.Delayed (De-)allocation Handling: This would be the main work in this approach.
It's hard to estimate what else would have to be done because the allocation logic is so deeply intertwined with the snapshots, syncing and stretches many parts of the
betree
crate.Advantages of Serialization-based Approach
Disadvantages of Serialization-based Approach
Conclusion
Both approaches offer viable paths to dynamic segment sizing. The bitmap-based approach is simpler to implement but less flexible. The serialization-based approach offers greater flexibility and potential optimizations but is harder to realize. The choice between the two depends on the project's priorities and the expected evolution of the allocator implementations.
The text was updated successfully, but these errors were encountered: