string_treatment is a library for cleaning and standardizing inconsistent string data.
Install the latest stable version from PyPI:
pip install string-treatment
String treatment:
- treat_referenced
- treat_unreferenced
With reference list
>>> from string_treatment import treat_referenced
>>> list_of_reference = ['João Pessoa/PB']
>>> data_with_inconsistency = ['João Pessoa PB', 'Joao pessoa--PB', 'joa pssoa(pb)']
>>> treat_referenced(data_with_inconsistency, list_of_reference)
['João Pessoa PB', 'João Pessoa PB', 'João Pessoa PB']
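For intuition about reference-based matching, here is a minimal sketch built only on the Python standard library. It is not the implementation of treat_referenced: the names normalize and match_to_reference and the cutoff parameter are hypothetical, and the sketch returns the reference entry verbatim, so its output formatting may differ from the library's.

```python
import difflib
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and collapse punctuation runs to single spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", without_accents.lower()).strip()


def match_to_reference(values, references, cutoff=0.6):
    """Map each value to its closest reference; keep the value if none is close.

    Hypothetical helper for illustration, not part of string_treatment.
    """
    # Compare normalized forms, but return the original reference spelling.
    normalized_refs = {normalize(ref): ref for ref in references}
    result = []
    for value in values:
        close = difflib.get_close_matches(
            normalize(value), list(normalized_refs), n=1, cutoff=cutoff
        )
        result.append(normalized_refs[close[0]] if close else value)
    return result


data = ['João Pessoa PB', 'Joao pessoa--PB', 'joa pssoa(pb)']
print(match_to_reference(data, ['João Pessoa/PB']))
```

Comparing normalized forms makes accents, case, and punctuation irrelevant to the similarity score, while the cutoff controls how much misspelling is tolerated before a value is left unchanged.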
Without reference list
>>> from string_treatment import treat_unreferenced
>>> data_with_inconsistency = ['João Pessoa PB', 'Joao pessoa--PB', 'joa pssoa(pb)']
>>> treat_unreferenced(data_with_inconsistency)
['João Pessoa PB', 'João Pessoa PB', 'João Pessoa PB']
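When no reference list is available, one possible approach is to group similar strings and reuse the first spelling seen in each group. The sketch below illustrates that idea with the standard library only. It is an assumption about how such grouping could work, not the algorithm behind treat_unreferenced, and the names fold and group_similar and the cutoff value are hypothetical.

```python
import difflib
import unicodedata


def fold(text):
    """Lowercase, strip accents, and replace non-alphanumerics with single spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in stripped.lower())
    return " ".join(cleaned.split())


def group_similar(values, cutoff=0.8):
    """Replace each value with the first-seen spelling of its similarity group.

    Hypothetical helper for illustration, not part of string_treatment.
    """
    representatives = {}  # folded key -> first original value seen for that group
    result = []
    for value in values:
        key = fold(value)
        close = difflib.get_close_matches(
            key, list(representatives), n=1, cutoff=cutoff
        )
        if close:
            result.append(representatives[close[0]])
        else:
            # No similar group yet: this value starts a new one.
            representatives[key] = value
            result.append(value)
    return result


data = ['João Pessoa PB', 'Joao pessoa--PB', 'joa pssoa(pb)']
print(group_similar(data))
```

This greedy grouping depends on input order: the first spelling encountered becomes the canonical form, so a misspelled first entry would propagate to the rest of its group.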
To learn how to use this library and see more examples, visit the User Guide, which is a Jupyter notebook.