Douglas-Peucker line simplification (data reduction).
Reduces the number of points in a two-dimensional dataset, while preserving its most striking features.
The resulting dataset is a subset of the original dataset.
Line simplification is typically used for geographical data, for example when zooming a digital map (see Django's GEOSGeometry.simplify(), which is based on GEOS). However, this type of algorithm can also be applied to general data reduction problems, as an alternative (or addition) to conventional filtering or subsampling. Some examples:
- creating miniature data plots
- pre-processing time-series data for feature detection (e.g. peak detection)
Normal installation:
pip install dopelines
With plot support (adds matplotlib):
pip install "dopelines[plot]"
With development tools:
pip install "dopelines[dev]"
Note: The PyPI project is called dopelines instead of dope, because PyPI would not let us create a project named dope, even though the name appears to be available.
from dope import DoPeR
data_original = [
[0, 0], [1, -1], [2, 2], [3, 0], [4, 0], [5, -1], [6, 1], [7, 0]
]
dp = DoPeR(data=data_original)
# use tolerance threshold (i.e. max. error w.r.t. normalized data)
data_simplified_eps = dp.simplify(tolerance=0.2)
# compare original data and simplified data in a plot
dp.plot()
# or use maximum recursion depth
data_simplified_depth = dp.simplify(max_depth=2)
Also see examples in tests.
Currently we only offer a recursive (depth-first) implementation, which is intuitive but may not be the most efficient solution. An iterative (breadth-first) implementation is in the works.
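To illustrate the difference between the two strategies, here is a minimal, standalone sketch of the Douglas-Peucker idea in plain Python. It is not the code used in this package: the function names (`perpendicular_distance`, `farthest_point`, `simplify_recursive`, `simplify_iterative`) are hypothetical, and the tolerance is assumed to be applied to the raw coordinates, without any normalization. Both variants select the same points and differ only in the order in which segments are visited.

```python
from collections import deque
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord: fall back to the point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def farthest_point(points, first, last):
    """Return (index, distance) of the interior point farthest from the chord."""
    best_index, best_distance = first, 0.0
    for i in range(first + 1, last):
        d = perpendicular_distance(points[i], points[first], points[last])
        if d > best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance


def simplify_recursive(points, tolerance):
    """Depth-first variant: split at the farthest point and recurse."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    if len(points) < 3:
        return list(points)
    index, distance = farthest_point(points, 0, len(points) - 1)
    if distance <= tolerance:
        # Every interior point is close enough to the chord: keep the ends.
        return [points[0], points[-1]]
    left = simplify_recursive(points[: index + 1], tolerance)
    right = simplify_recursive(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


def simplify_iterative(points, tolerance):
    """Breadth-first variant: process index ranges from a queue."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    if len(points) < 3:
        return list(points)
    keep = {0, len(points) - 1}
    queue = deque([(0, len(points) - 1)])
    while queue:
        first, last = queue.popleft()
        if last - first < 2:
            continue  # no interior points left in this range
        index, distance = farthest_point(points, first, last)
        if distance > tolerance:
            keep.add(index)
            queue.append((first, index))
            queue.append((index, last))
    return [points[i] for i in sorted(keep)]


data = [[0, 0], [1, -1], [2, 2], [3, 0], [4, 0], [5, -1], [6, 1], [7, 0]]
print(simplify_recursive(data, 1.5))
print(simplify_iterative(data, 1.5))
```

The queue-based variant avoids Python's recursion limit on long datasets and works on index ranges instead of copying list slices, which is one reason an iterative implementation can be attractive.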