Test implementation of an algorithm described in the paper "Sublinear Algorithms for Approximating String Compressibility" (https://arxiv.org/abs/0706.1084).
(Written as part of a computer science MSc thesis related to the subject.)
Install:
pip3 install -r requirements.txt
Use:
python3 run.py -A 18 -e 0.1 -i myfile
The dsstools directory contains some utilities for counting the number of distinct substrings of different lengths in a file, also related to the aforementioned paper.
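To illustrate what "counting the number of distinct substrings of different lengths" means, here is a minimal sketch. It is not the code in dsstools: the function names count_distinct_substrings and distinct_substring_profile are hypothetical, and the approach is a naive exact count that stores every window in a set, so it is only suitable for small inputs.

```python
from typing import Dict


def count_distinct_substrings(data: bytes, length: int) -> int:
    """Count distinct contiguous substrings of exactly `length` bytes.

    Hypothetical helper for illustration; not part of dsstools.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if length > len(data):
        return 0
    # Slide a window of width `length` over the data and keep each
    # distinct slice once. Memory use grows with len(data) * length.
    windows = {data[i:i + length] for i in range(len(data) - length + 1)}
    return len(windows)


def distinct_substring_profile(data: bytes, max_length: int) -> Dict[int, int]:
    """Map each length 1..max_length to its distinct-substring count."""
    return {
        length: count_distinct_substrings(data, length)
        for length in range(1, max_length + 1)
    }
```

To apply it to a file, read the file in binary mode (for example open("myfile", "rb").read()) and pass the resulting bytes to either function. For the input b"abracadabra", the counts for lengths 1, 2 and 3 are 5, 7 and 7.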