Make the data ("Stillstandsprotokolle des 17. Jahrhunderts") easier to search and georeference it for visualization.
- Ernst Rosser
- Barbara Leimgruber
- Rebekka Plüss
- Ismail Prada
- Matthias Mazenauer
- Tobias Hodel
Primary Data
Secondary data
- Create lookup for normalized strings (https://github.com/mmznr/Staatsarchiv-GLAMhack/blob/master/woerterStillstand_Result.tsv)
- Annotate named entities (normalization): places (also add BfS data) and persons (normalized forms to be used for auto-complete in search)
- Cluster words, based on "Frequenztabelle Stillstandsprotokolle" (see https://github.com/mmznr/Staatsarchiv-GLAMhack/blob/master/README.md#frequency-list-of-word-cluster), so that a cluster can be used to refer to a topic/concept
- Cluster documents, with the clusters to be used as keyword(s) in the TEI header (scripts for clustering: see folder "code")
- Create script that adds this information as tags (in the body) and writes it to XML (in progress)
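The lookup and tagging steps above could be sketched roughly as follows. This is a minimal illustration, not the project's script: it assumes a two-column TSV layout (original spelling, normalized form), which may differ from the actual `woerterStillstand_Result.tsv`, and it assumes that place names are marked with a TEI `placeName` element carrying the normalized form in a `key` attribute. The sample rows and the function names `load_lookup` and `tag_places` are hypothetical.

```python
import csv
import io
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample rows in the ASSUMED layout: original<TAB>normalized.
SAMPLE_TSV = "Zürich\tZürich\nZürych\tZürich\nkilchen\tKirche\n"


def load_lookup(tsv_file):
    """Read a two-column TSV into a dict: lower-cased spelling -> normalized form."""
    lookup = {}
    for row in csv.reader(tsv_file, delimiter="\t"):
        # Skip empty or incomplete rows.
        if len(row) >= 2 and row[0].strip():
            lookup[row[0].strip().lower()] = row[1].strip()
    return lookup


def tag_places(text, lookup, places):
    """Wrap tokens whose normalized form is a known place in <placeName>.

    `places` is a set of normalized place names. All other text is
    XML-escaped and passed through unchanged.
    """
    parts = []
    # The capturing group keeps both the word tokens and the text between them.
    for piece in re.split(r"(\w+)", text):
        normalized = lookup.get(piece.lower())
        if normalized in places:
            parts.append(
                "<placeName key={}>{}</placeName>".format(
                    quoteattr(normalized), escape(piece)
                )
            )
        else:
            parts.append(escape(piece))
    return "".join(parts)


lookup = load_lookup(io.StringIO(SAMPLE_TSV))
tagged = tag_places("gen Zürych & heim", lookup, {"Zürich"})
print(tagged)
```

The same lookup dictionary could also feed the auto-complete for person names, since it maps every attested spelling to one normalized form.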
Done: Wordlist and Frequencies
ToDo: POS tagging
Names of persons: done A-D
Names of places: done A-K
Cluster visualizations (using fastText): https://github.com/mmznr/Staatsarchiv-GLAMhack/tree/master/Visualisierungen/clusters.png https://github.com/mmznr/Staatsarchiv-GLAMhack/tree/master/Visualisierungen/clusters2.png
Done: Borders from swisstopo via Linked Data, Matching of the settlements of the canton of Zurich
ToDo: Get a list of the old names of these settlements, match them, and show all documents relating to a settlement (or municipality)
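One possible way to approach the open matching step is sketched below. It assumes a hand-made table of old names plus a fuzzy fallback for spelling variants, using `difflib` from the Python standard library. The names, spellings, and document IDs are invented for illustration and are not taken from the project's data; `resolve` and `documents_by_settlement` are hypothetical helpers.

```python
from collections import defaultdict
from difflib import get_close_matches

# Invented illustrative data, NOT from the project's sources.
OLD_NAMES = {"wintertur": "Winterthur", "usteri": "Uster"}
MODERN = ["Winterthur", "Uster", "Bülach"]


def resolve(name, old_names=OLD_NAMES, modern=MODERN, cutoff=0.8):
    """Map a place name as written to a modern settlement name, or None."""
    key = name.strip().lower()
    # 1. Explicit table of known old names.
    if key in old_names:
        return old_names[key]
    # 2. Fallback: closest modern name above the similarity cutoff.
    by_lower = {m.lower(): m for m in modern}
    hits = get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[hits[0]] if hits else None


def documents_by_settlement(mentions):
    """Group document IDs by resolved settlement.

    `mentions` is an iterable of (document_id, place_name_as_written).
    Unresolved names are skipped.
    """
    index = defaultdict(set)
    for doc_id, name in mentions:
        settlement = resolve(name)
        if settlement is not None:
            index[settlement].add(doc_id)
    return {place: sorted(ids) for place, ids in index.items()}


mentions = [
    ("doc1", "Wintertur"),
    ("doc2", "Winterthur"),
    ("doc3", "Uster"),
    ("doc1", "Uster"),
]
index = documents_by_settlement(mentions)
print(index)
```

The cutoff of 0.8 is an arbitrary starting value; fuzzy matches on historical spellings would likely need manual review before they are used for the map.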