This program was written as an individual project for INST742 - Implementing Digital Curation with Dr. Marciano at the University of Maryland's iSchool. It starts from a dataset we created by running OCR on a 1911 Charlotte city registry from the Internet Archive and then cleaning the output with OpenRefine. The goal was to map the addresses identified in the registry together with their associated metadata (profession, race, etc.).
It was created as an instructional tool for information scientists, demonstrating how they could apply similar techniques to their own datasets. The code is integrated into Jupyter notebooks alongside descriptions of the process, so users can both see the final results and interact with the code for a deeper understanding. The Python code is compiled into the Charlotte-1911-code.py file.
The easiest way to use the Jupyter notebooks is to save them locally and run them in VS Code with the appropriate Jupyter extensions installed.
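For readers who want a feel for the mapping goal before opening the notebooks, here is a minimal sketch of one way to turn address records and their metadata into GeoJSON points that a mapping tool can display. It is not the code from this project: the field names, the sample rows, and the `records_to_geojson` helper are all hypothetical, and it assumes each address has already been geocoded to a latitude and longitude by some earlier step.

```python
import json

# Hypothetical records. The field names and values are placeholders for
# illustration only; they are not rows from the actual 1911 dataset.
SAMPLE_RECORDS = [
    {"address": "100 Example St", "profession": "carpenter",
     "race": "unknown", "lat": 35.2271, "lon": -80.8431},
    # A record whose address could not be geocoded (assumed case).
    {"address": "200 Sample Ave", "profession": "clerk",
     "race": "unknown", "lat": None, "lon": None},
]


def records_to_geojson(records):
    """Build a GeoJSON FeatureCollection from geocoded address records.

    Records without coordinates are skipped. Every field other than
    lat/lon is carried along as a property of the point.
    """
    features = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue  # nothing to plot for this record
        properties = {k: v for k, v in rec.items() if k not in ("lat", "lon")}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    print(json.dumps(records_to_geojson(SAMPLE_RECORDS), indent=2))
```

Keeping the metadata as feature properties means a map viewer could color or filter points by a field such as profession without going back to the original table.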