
Dynamic Segment Sizing for Allocators #67

Open
pzittlau opened this issue Jan 15, 2025 · 0 comments

Currently, the segment size managed by an allocator is fixed at 1 GiB. This issue proposes two approaches to make segment sizing dynamic and configurable at runtime. The goal is to support allocators that manage entire disks/vdevs, while still allowing the current 1 GiB segments or even smaller sizes.

Problem

The fixed segment size and implementation limit flexibility in allocator design. For larger vdevs, managing an allocation bitmap for the entire device could be more efficient than handling many smaller segments, especially if more powerful allocators than the current SegmentAllocator are used. However, for very large vdevs (e.g., 10 TiB) the allocation bitmap itself becomes quite large (320 MiB in this case), posing challenges for reading and writing it during each generation update. This could also become problematic for moderate vdev sizes like 1 TiB (32 MiB bitmap). Smaller segment sizes might also be desirable in other situations for finer-grained control.
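As a quick sanity check of the sizes quoted above, the bitmap overhead can be computed directly. This sketch assumes a 4 KiB block size and one bitmap bit per block (an assumption for illustration; the actual block size is defined in the betree crate):

```rust
// One bitmap bit per block: bitmap size = (vdev size / block size) / 8.
fn bitmap_bytes(vdev_bytes: u64, block_bytes: u64) -> u64 {
    (vdev_bytes / block_bytes) / 8
}

fn main() {
    const TIB: u64 = 1 << 40;
    const MIB: u64 = 1 << 20;
    // 10 TiB vdev -> 320 MiB bitmap, 1 TiB -> 32 MiB, matching the text.
    assert_eq!(bitmap_bytes(10 * TIB, 4096) / MIB, 320);
    assert_eq!(bitmap_bytes(TIB, 4096) / MIB, 32);
}
```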


Proposed Changes (Bitmap-based)

This approach retains the bitmap-based allocators but makes the segment size configurable.

  1. Configuration: Introduce a new parameter in DatabaseConfiguration to specify the allocator segment size. This parameter could accept values like "128", "1024", "8192" to define the size of a segment in bytes or KiB/MiB. A value of "full" could signal that the allocator manages the entire vdev.
    The parameter could specify the size for the whole storage configuration or for individual disks or layers.

  2. Allocator Interface: Modify the Allocator trait created in feat: Implement allocator trait with example implementations #66 to accept the segment size during initialization.

    pub trait Allocator: Send + Sync {
        fn new(bitmap: &[u8], segment_size: Block<u64>) -> Self
        where
            Self: Sized;
        // ... other methods
    }
  3. SegmentId Handling: Remove the implicit tie between SegmentId and disk offset, since a segment may now span an entire disk. Generalize SegmentId accordingly.

  4. Bitmap Storage: The way bitmaps are stored probably doesn't have to change at all, because SegmentIds will still be unique and the root tree can therefore index them properly. It still has to be investigated how the root tree handles bitmaps larger than the maximum leaf size.

  5. Data Management Layer (DML) Changes:

    • Update DML to handle the new configuration parameter and pass the segment size to the allocator.
    • Modify the allocation and deallocation methods in DML to handle both full vdev allocation and segmented allocation.
  6. Handler Changes: The allocators HashMap can probably stay the same for similar reasons as the bitmap storage.
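As a rough sketch of steps 1 and 2 above, the configuration value could be parsed into an internal representation and handed to the allocator at construction time. All names here (SegmentSize, parse_segment_size, BitmapAllocator) are illustrative rather than the actual betree API, and Block<u64> is simplified to plain integers:

```rust
// Hypothetical form of the proposed configuration value: a block count,
// or "full" for whole-vdev management.
#[derive(Debug, PartialEq)]
enum SegmentSize {
    Blocks(u64),
    Full,
}

fn parse_segment_size(s: &str) -> Result<SegmentSize, String> {
    match s {
        "full" => Ok(SegmentSize::Full),
        n => n
            .parse::<u64>()
            .map(SegmentSize::Blocks)
            .map_err(|_| format!("invalid segment size: {:?}", s)),
    }
}

// Minimal first-fit bitmap allocator that takes the segment size at
// construction, as step 2 proposes, instead of assuming 1 GiB.
struct BitmapAllocator {
    bits: Vec<bool>, // one flag per block; true = allocated
}

impl BitmapAllocator {
    fn new(bitmap: &[u8], segment_size: usize) -> Self {
        // Unpack one bit per block from the on-disk bitmap.
        let bits = (0..segment_size)
            .map(|i| bitmap[i / 8] & (1 << (i % 8)) != 0)
            .collect();
        BitmapAllocator { bits }
    }

    // First-fit scan for `size` contiguous free blocks; returns the
    // starting block offset on success.
    fn allocate(&mut self, size: usize) -> Option<usize> {
        let mut run = 0;
        for i in 0..self.bits.len() {
            run = if self.bits[i] { 0 } else { run + 1 };
            if run == size {
                let start = i + 1 - size;
                self.bits[start..=i].iter_mut().for_each(|b| *b = true);
                return Some(start);
            }
        }
        None
    }
}

fn main() {
    assert_eq!(parse_segment_size("1024"), Ok(SegmentSize::Blocks(1024)));
    assert_eq!(parse_segment_size("full"), Ok(SegmentSize::Full));
    assert!(parse_segment_size("huge").is_err());

    // Block 0 is pre-allocated; the first 3-block free run starts at 1.
    let mut alloc = BitmapAllocator::new(&[0b0000_0001], 8);
    assert_eq!(alloc.allocate(3), Some(1));
    assert_eq!(alloc.allocate(5), None); // only blocks 4..=7 remain free
}
```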

Challenges

  • Error Handling: Implement error handling for invalid segment sizes or configurations.
  • Testing: Develop tests to cover different segment sizes and configurations.

Proposed Changes (Serialization-based)

This approach moves away from the fixed bitmap representation, allowing allocators to manage their state using arbitrary, serializable data structures.

  1. Configuration: Same as in the bitmap-based approach. Introduce a new parameter in DatabaseConfiguration to specify the allocator segment size, supporting values like "128", "1024", "8192" (for Bytes or KiB/MiB) and "full".

  2. Allocator Interface: Modify the Allocator trait proposed in feat: Implement allocator trait with example implementations #66 to use serialization for state management.

    pub trait Allocator: Send + Sync {
        fn new(segment_size: Block<u64>) -> Self where Self: Sized;
    
        fn allocate(&mut self, size: Block<u32>) -> Option<u32>;
    
        fn allocate_at(&mut self, size: Block<u32>, offset: u32) -> bool;
    
        fn deallocate(&mut self, offset: u32, size: Block<u32>);
    
        fn serialize(self) -> SlicedCowBytes;
    
        fn deserialize(data: SlicedCowBytes) -> Result<Self, Error> where Self: Sized;
    
        fn size(&self) -> Block<u32>;
    }

    The new function now takes a segment_size parameter. allocate, allocate_at, and deallocate operate on block offsets within a segment. serialize and deserialize manage the state saved to and loaded from disk.

  3. SegmentId Handling: Same as in the bitmap-based approach. Generalize SegmentId to support both traditional segments and entire disks.

  4. Allocator State Storage: Instead of storing bitmaps directly, the root tree will store the serialized allocator state. The SegmentId will continue to be used as the key. The Handler will need to serialize and deserialize allocator states.

  5. Data Management Layer (DML) Changes:

    • Update DML to handle the new configuration parameter and pass the segment size to the allocator during initialization.
    • Modify the allocation and deallocation methods to interact with the Allocator trait's new methods.
  6. Handler Changes:

    • The Handler will now need to serialize the allocator's state when storing it in the root tree and deserialize it when retrieving.
    • The allocators HashMap can probably stay the same.
  7. Delayed (De-)allocation Handling: This would be the main work in this approach.

  • Because the internal state of allocators could be complex, the current approach of directly modifying the bitmaps without the allocator knowing won't work.
  • This means that many parts of the allocation and deallocation logic have to be rewritten to be agnostic of the allocator internals.
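To make the serialization contract from step 2 concrete, here is a round-trip sketch for a bitmap-backed allocator state. SlicedCowBytes is stood in by Vec<u8> and the error type is simplified to String so the sketch is self-contained; for such an allocator the serialized form is essentially the packed bitmap itself, which is why the overhead over the current scheme should be small:

```rust
// Hypothetical on-disk state of a bitmap-backed allocator: the segment
// size followed by the packed bitmap (one bit per block).
#[derive(Debug, PartialEq)]
struct BitmapState {
    segment_size: u32, // in blocks
    bitmap: Vec<u8>,
}

impl BitmapState {
    fn serialize(self) -> Vec<u8> {
        let mut out = self.segment_size.to_le_bytes().to_vec();
        out.extend_from_slice(&self.bitmap);
        out
    }

    fn deserialize(data: &[u8]) -> Result<Self, String> {
        // Reject states too short to contain the size header.
        let header: [u8; 4] = data
            .get(..4)
            .and_then(|s| s.try_into().ok())
            .ok_or("truncated allocator state")?;
        Ok(BitmapState {
            segment_size: u32::from_le_bytes(header),
            bitmap: data[4..].to_vec(),
        })
    }
}

fn main() {
    let state = BitmapState { segment_size: 64, bitmap: vec![0xFF, 0x00] };
    let bytes = state.serialize();
    let restored = BitmapState::deserialize(&bytes).unwrap();
    assert_eq!(restored, BitmapState { segment_size: 64, bitmap: vec![0xFF, 0x00] });
    assert!(BitmapState::deserialize(&[1, 2]).is_err());
}
```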

It's hard to estimate what else would have to be done, because the allocation logic is deeply intertwined with snapshots and syncing and touches many parts of the betree crate.

Advantages of Serialization-based Approach

  • Flexibility: Allocators are no longer constrained to a bitmap representation, allowing for more sophisticated allocation strategies.
  • Extensibility: Easier to add new allocator types with different state representations.
  • Potential for Reduced Memory Usage: Some allocators might require less memory to track free space compared to a full bitmap.

Disadvantages of Serialization-based Approach

  • Implementation Time: Because large parts of the code have to be changed, this approach would require a lot of rewriting and testing.
  • Complexity: Increased complexity due to serialization and deserialization, though possibly also decreased complexity, because allocation bitmaps (or other new structures) are no longer modified in different parts of the codebase.
  • Serialization Overhead: Serializing and deserializing allocator state might introduce overhead compared to the simple bitmap approach, though for allocators that use just a bitmap it should be about the same.
  • Testing: Testing will be needed to ensure the correctness of the serialization and deserialization logic for each allocator.

Conclusion

Both approaches offer viable paths to dynamic segment sizing. The bitmap-based approach is simpler to implement but less flexible. The serialization-based approach offers greater flexibility and potential optimizations but is harder to realize. The choice between the two depends on the project's priorities and the expected evolution of the allocator implementations.
