Trepl

Trepl is a generic implementation of Tiered Replication (Cidon et al.), designed to help pick replica placement for Kafka partitions and configure WADE chains. However, it can be used in any situation where you want to adjust the probability of data loss or unavailability caused by multiple replica failures.
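
To make "probability of data loss" concrete, here is a rough sketch based on the model used in the Copysets paper: if exactly R nodes fail at once, chosen uniformly at random, data is lost only when the failed set is one of the copysets. The helper `loss_probability` is hypothetical and not part of Trepl, and it assumes every copyset has exactly R nodes.

```python
from math import comb


def loss_probability(copysets, num_nodes, R):
    """Chance that R random simultaneous node failures wipe out a whole copyset.

    Hypothetical helper, not part of Trepl. Assumes each copyset has exactly R nodes.
    """
    # Ignore ordering and duplicates so each copyset is counted once.
    distinct = {frozenset(cs) for cs in copysets}
    return len(distinct) / comb(num_nodes, R)


two = [['node1', 'node2'], ['node1', 'node3']]
three = two + [['node2', 'node3']]

print(loss_probability(two, num_nodes=3, R=2))    # about 0.667
print(loss_probability(three, num_nodes=3, R=2))  # 1.0
```

Fewer copysets means fewer failure combinations that lose data, at the cost of each node sharing data with fewer peers.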

Tiered Replication follows up on ideas introduced in the Copysets paper, where you'll find detailed information on motivations and use cases.

Usage

Basic Trepl usage is simple:

>>> trepl.build_copysets(['node1', 'node2', 'node3'], R=2, S=1)
[['node1', 'node2'], ['node1', 'node3']]

>>> trepl.build_copysets(['node1', 'node2', 'node3'], R=2, S=2)
[['node1', 'node2'], ['node1', 'node3'], ['node2', 'node3']]
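
One way to read the two results above is to compute each node's scatter width, taken here to mean the number of distinct other nodes it shares a copyset with. The sketch below is a standalone illustration; `scatter_widths` is a hypothetical helper, not a Trepl function, and it treats S as a lower bound that each node should reach.

```python
from collections import defaultdict


def scatter_widths(copysets):
    """Map each node to the number of distinct peers it shares a copyset with.

    Hypothetical helper for illustration; not part of Trepl.
    """
    peers = defaultdict(set)
    for copyset in copysets:
        for node in copyset:
            peers[node].update(n for n in copyset if n != node)
    return {node: len(others) for node, others in peers.items()}


print(scatter_widths([['node1', 'node2'], ['node1', 'node3']]))
# {'node1': 2, 'node2': 1, 'node3': 1}

print(scatter_widths([['node1', 'node2'], ['node1', 'node3'], ['node2', 'node3']]))
# {'node1': 2, 'node2': 2, 'node3': 2}
```

Under that reading, the S=1 result leaves every node with at least one peer, while the S=2 result needs the third copyset so that node2 and node3 each reach two peers.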

Trepl also ships with rack-aware and tier-aware checker functions:

# not rack aware
>>> trepl.build_copysets(['node1', 'node2', 'node3'], R=2, S=1)
[['node1', 'node2'], ['node1', 'node3']]

# rack aware: node1 and node2 cannot share a copyset since they're in
# the same rack
>>> rack_map = {'node1': 'rack1', 'node2': 'rack1', 'node3': 'rack3'}
>>> trepl.build_copysets(
...     list(rack_map), R=2, S=1,
...     checker=trepl.checkers.rack(rack_map),
... )
[['node1', 'node3'], ['node2', 'node3']]
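
The rack constraint in the example above can be expressed as a simple predicate over one copyset. This is a minimal sketch of the idea, not the implementation behind `trepl.checkers.rack`; the name `is_rack_aware` and its signature are assumptions made for illustration.

```python
def is_rack_aware(copyset, rack_map):
    """True when no two nodes in the copyset live in the same rack.

    Hypothetical predicate for illustration; not Trepl's own checker.
    """
    racks = [rack_map[node] for node in copyset]
    return len(set(racks)) == len(racks)


rack_map = {'node1': 'rack1', 'node2': 'rack1', 'node3': 'rack3'}

print(is_rack_aware(['node1', 'node3'], rack_map))  # True
print(is_rack_aware(['node1', 'node2'], rack_map))  # False, both are in rack1
```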

# scatter width must be 2, and data must exist on at least one node in
# the backup tier
>>> primary = ['A', 'B', 'C']
>>> backup = ['d', 'e']
>>> trepl.build_copysets(
...     primary + backup, R=2, S=2,
...     checker=trepl.checkers.tiered(backup, 2),
... )
[['A', 'd'], ['A', 'e'], ['B', 'd'], ['B', 'e'], ['C', 'd'], ['C', 'e']]
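
The backup-tier requirement from the example above can be checked after the fact with a small predicate. This is only a sketch of the constraint as described in the comment; `touches_backup` is a hypothetical name, and it does not reproduce whatever `trepl.checkers.tiered` does internally.

```python
def touches_backup(copyset, backup):
    """True when at least one node in the copyset belongs to the backup tier.

    Hypothetical predicate for illustration; not Trepl's own checker.
    """
    backup_nodes = set(backup)
    return any(node in backup_nodes for node in copyset)


backup = ['d', 'e']
copysets = [['A', 'd'], ['A', 'e'], ['B', 'd'], ['B', 'e'], ['C', 'd'], ['C', 'e']]

print(all(touches_backup(cs, backup) for cs in copysets))  # True
print(touches_backup(['A', 'B'], backup))                  # False, primary tier only
```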

Authors