Helper library to experiment with HTML fingerprinting.
- Free software: MIT license
- Documentation: https://sketchtml.readthedocs.io
- Features: TODO
- Locality Sensitive Hashing for Scalable Structural Classification and Clustering of Web Documents (2013)
- Enforcing k-anonymity in Web Mail Auditing -- Mail-Hash (2016) (patent)
- Structural Clustering of Machine-Generated Mail (2016)
- Web-Scale Information Extraction with Vertex (2011)
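As a rough illustration of what structural HTML fingerprinting can look like, here is a minimal sketch in the spirit of the locality-sensitive hashing reference listed above. It does not use the sketchtml API, which this README does not describe. It relies only on the Python standard library, and every name in it (`TagCollector`, `tag_shingles`, `minhash_signature`, `estimated_similarity`) is hypothetical. The technique shown is MinHash over shingles of tag names, chosen for illustration; it is not necessarily what the library or the referenced papers implement.

```python
import hashlib
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect opening tag names in document order, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_shingles(html, k=3):
    """Return the set of k-grams over the sequence of opening tags (hypothetical helper)."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    tags = parser.tags
    if len(tags) < k:
        # Too few tags for a full k-gram: fall back to one short shingle, or none.
        return {tuple(tags)} if tags else set()
    return {tuple(tags[i:i + k]) for i in range(len(tags) - k + 1)}


def _seeded_hash(seed, shingle):
    """Derive a 64-bit integer from a shingle; the seed selects one of many hash functions."""
    data = f"{seed}:{' '.join(shingle)}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(shingles, num_hashes=64):
    """For each seeded hash function, keep the minimum hash value over all shingles."""
    if not shingles:
        return [0] * num_hashes
    return [min(_seeded_hash(seed, s) for s in shingles) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Two pages with the same tag structure but different text, and one with a different layout.
page_a = "<html><body><div><p>Hello</p><p>World</p></div></body></html>"
page_b = "<html><body><div><p>Other</p><p>Text</p></div></body></html>"
page_c = "<html><body><table><tr><td>1</td></tr></table></body></html>"

sig_a = minhash_signature(tag_shingles(page_a))
sig_b = minhash_signature(tag_shingles(page_b))
sig_c = minhash_signature(tag_shingles(page_c))

print(estimated_similarity(sig_a, sig_b))  # 1.0, since the tag structure is identical
print(estimated_similarity(sig_a, sig_c))  # much lower, since no tag shingles are shared
```

Because the signature depends only on tag names, pages generated from the same template tend to collide even when their text differs, which is the property structural clustering relies on. Including attributes, tag depth, or closing tags in the shingles would be a reasonable variation.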
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.