Trepl

Trepl is a generic Tiered Replication (Cidon et. al) implementation, designed to help pick replica placement of Kafka partitions and configure WADE chains. However, it can be used in any situation where you might want to adjust probability of data loss / unavailability from multiple replica failures.

Tiered Replication follows up on ideas introduced in the Copysets paper, where you'll find detailed information on motivations and use cases:

Usage

Basic Trepl usage is simple:

>>> trepl.build_copysets(['node1', 'node2', 'node3'], R=2, S=1)
[['node1', 'node2'], ['node1', 'node3']]

>>> trepl.build_copysets(['node1', 'node2', 'node3'], R=2, S=2)
[['node1', 'node2'], ['node1', 'node3'], ['node2', 'node3']]

Trepl also ships with rack and tier aware check functions:

# not rack aware
>>> trepl.build_copysets(['node1', 'node2', 'node3'], R=2, S=1)
[['node1', 'node2'], ['node1', 'node3']]

# rack aware, node1 and node2 can not share a copyset since they're in
# the same rack
>>> rack_map = { 'node1': 'rack1', 'node2': 'rack1', 'node3': 'rack3' }
>>> trepl.build_copysets(
      rack_map.keys(), R=2, S=1,
      checker=trepl.checkers.rack(rack_map),
    )
[['node1', 'node3'], ['node2', 'node3']]

# scatter width must be 2, and data must exist on at least one node in
# the backup tier
>>> primary = ['A', 'B', 'C']
>>> backup = ['d', 'e']
>>> trepl.build_copysets(
      primary + backup, R=2, S=2,
      checker=trepl.checkers.tiered(backup, 2),
    )
[['A', 'd'], ['A', 'e'], ['B', 'd'], ['B', 'e'], ['C', 'd'], ['C', 'e']]

Authors

Wes Chow
Matt Krukowski

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Trepl

Usage

Authors

Files

README.md

Latest commit

History

README.md

File metadata and controls

Trepl

Usage

Authors