This library provides algorithms for creating semantic descriptions of mineral-related tables for data extraction. It also integrates with SAND to interactively curate the semantic descriptions.
To work with the MinMod KG, we need the ta2-minmod-data repository. The default folder structure is:
<DARPA-CRITICALMAAS-DIR>
├── data # for storing databases
├── ta2-minmod-data # ta2-minmod-data repository
└── ta2-table-understanding # ta2-table-understanding repository
To set up the above structure, run:
git clone --depth 1 https://github.com/DARPA-CRITICALMAAS/ta2-minmod-data
git clone --depth 1 https://github.com/DARPA-CRITICALMAAS/ta2-minmod-kg
git clone --depth 1 https://github.com/DARPA-CRITICALMAAS/ta2-table-understanding
mkdir data
Note: The folder structure is fully customizable. Please see the Configuration Section for more information.
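As a quick sanity check on the layout above, the following minimal sketch reports which of the expected folders are missing under a given root. The helper name missing_dirs and the constant EXPECTED_DIRS are hypothetical and not part of this library; only the folder names come from the default structure shown above.

```python
from pathlib import Path

# Folder names taken from the default structure described above.
EXPECTED_DIRS = ("data", "ta2-minmod-data", "ta2-table-understanding")


def missing_dirs(root: Path) -> list[str]:
    """Return the expected folder names that do not exist under `root`."""
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling missing_dirs(Path.cwd()) from <DARPA-CRITICALMAAS-DIR> should return an empty list once the clone and mkdir commands have completed. If you use a customized layout, adjust EXPECTED_DIRS accordingly.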
We use poetry as our package manager (you need to have it installed on your machine first). To install the library and its dependencies, run poetry install in the root directory of this repository. Then, you can run poetry shell to activate the virtual environment, or use poetry run <command> to run commands inside the virtual environment.
cd ta2-table-understanding
python -m venv .venv
poetry install
With the working folder structure set up, we can build the necessary databases (entities, ontology classes, and properties) by running:
poetry run python -m tum.make_db
Check out the demo notebook on how to use the library programmatically.
Alternatively, you can use the SAND UI to interactively load a table, create the semantic description, and extract data from the table.
To install SAND, run:
pip install web-sand sand-drepr
- Set up SAND (run only once):
poetry run python -m sand init -d <DARPA-CRITICALMAAS-DIR>/data/minmod/sand.db
- Start SAND:
poetry run python -m sand start -d <DARPA-CRITICALMAAS-DIR>/data/minmod/sand.db -c <DARPA-CRITICALMAAS-DIR>/ta2-table-understanding/minmod.sand.yml
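Both SAND commands above derive their paths from the same <DARPA-CRITICALMAAS-DIR> root. The sketch below builds the two argument lists from a single root path so the database and configuration paths stay consistent. The function name sand_commands is hypothetical; the flags and file names are copied from the commands above, and nothing here is part of SAND itself.

```python
from pathlib import Path


def sand_commands(root: Path) -> dict[str, list[str]]:
    """Build the `init` and `start` argument lists shown above for a given root."""
    # Paths mirror the ones used in the commands above.
    db = root / "data" / "minmod" / "sand.db"
    config = root / "ta2-table-understanding" / "minmod.sand.yml"
    base = ["poetry", "run", "python", "-m", "sand"]
    return {
        "init": base + ["init", "-d", str(db)],
        "start": base + ["start", "-d", str(db), "-c", str(config)],
    }
```

The returned lists can be passed to subprocess.run(..., check=True), with the working directory set to the ta2-table-understanding folder so that poetry picks up the right environment.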
- The working folder <DARPA-CRITICALMAAS-DIR> can be changed by setting the environment variable CRITICAL_MAAS_DIR.
- To customize SAND, update the file minmod.sand.yml (the file passed with -c when starting SAND).
- Training data for the model is stored in the data/training_set folder.
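To illustrate the CRITICAL_MAAS_DIR override, here is a minimal sketch of reading the variable with a fallback. How the library itself resolves the working folder is not shown here, so the function name resolve_working_dir and the fallback behavior (use a caller-supplied default when the variable is unset or blank) are assumptions.

```python
import os
from pathlib import Path

ENV_VAR = "CRITICAL_MAAS_DIR"


def resolve_working_dir(default: Path) -> Path:
    """Return the directory named by CRITICAL_MAAS_DIR, or `default` if unset or blank."""
    value = os.environ.get(ENV_VAR, "").strip()
    # Assumption: a blank value is treated the same as an unset variable.
    return Path(value).expanduser() if value else default
```

For example, after running export CRITICAL_MAAS_DIR=/path/to/workdir in the shell, resolve_working_dir(Path.cwd()) returns that path instead of the current directory.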