A Zeebe text classifier worker based on a Hugging Face NLP pipeline
Set up a Python virtual environment (version 3.7.10) and install the requirements:
pip install -r requirements.txt
Specify either a local model (after downloading it into the models folder) or a Hugging Face token classification model in the .env file.
Because some models consume a lot of resources, this worker is configurable in terms of task name and associated model. This makes it possible to split tasks and models across multiple workers, for example one per language:
- task ner-en and the default model (downloaded from Hugging Face's website at worker startup)
- task ner-fr and the local model models/camembert-ner-with-dates
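As a rough sketch of how such a configuration could be read at worker startup, the snippet below pulls a task name and an optional model identifier from environment-style variables. The variable names ZEEBE_TASK_NAME and MODEL_NAME and the helper load_worker_config are hypothetical and chosen for illustration only; check the repository's .env file for the actual keys.

```python
import os
from typing import Mapping, NamedTuple, Optional


class WorkerConfig(NamedTuple):
    task_name: str
    model: Optional[str]  # None means "let the pipeline use its default model"


def load_worker_config(env: Mapping[str, str] = os.environ) -> WorkerConfig:
    """Read the task name and model from environment-style variables.

    ZEEBE_TASK_NAME and MODEL_NAME are assumed names, not the worker's real keys.
    """
    task_name = env.get("ZEEBE_TASK_NAME", "").strip()
    if not task_name:
        raise ValueError("ZEEBE_TASK_NAME must be set")
    # An empty or missing model falls back to None (default model).
    model = env.get("MODEL_NAME", "").strip() or None
    return WorkerConfig(task_name=task_name, model=model)


# Two workers, one per language, each with its own task type and model.
english = load_worker_config({"ZEEBE_TASK_NAME": "ner-en"})
french = load_worker_config(
    {"ZEEBE_TASK_NAME": "ner-fr", "MODEL_NAME": "models/camembert-ner-with-dates"}
)
```

Keeping one task type per worker lets each container load only the model it needs, which is the point of the split described above.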
If you have a Zeebe instance running locally (for example via docker-compose), you can run or debug the worker with:
python index.py
python -m unittest
docker build -t docker.pkg.github.com/teode/zeebe-ner-worker/zeebe-ner-french-worker:v1.0.0 -f Dockerfile.fr .
docker push docker.pkg.github.com/teode/zeebe-ner-worker/zeebe-ner-french-worker:v1.0.0
Otherwise, get the image from Docker Hub: https://hub.docker.com/r/teode/zeebe-ner-french-worker
The worker needs a local or port-forwarded Zeebe gateway to connect to. Then run:
docker run --name zb-ner-fr-wkr zeebe-ner-french-worker
Example BPMN with service task:
<bpmn:serviceTask id="my-ner" name="My NER">
<bpmn:extensionElements>
<zeebe:taskDefinition type="my-env-var-task-name" />
</bpmn:extensionElements>
</bpmn:serviceTask>
- the worker is registered for the type of your choice (set as an env var)
- required variables:
  - sequence - the phrase to classify
- jobs are completed with an entities object containing serialized properties:
  - entity_group - ORG (organization), DATE, PER (first name and/or last name), LOC (location)
  - score - the confidence score
  - word - the word extracted from the sequence
  - start - the position of the first letter
  - end - the position of the last letter
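To make the output shape concrete, here is a minimal sketch of turning raw token classification results into the entities job variable. It assumes the model output is a list of dicts carrying the five properties listed above; the function name to_entities and the sample values are hypothetical. Scores are cast to plain float because such pipelines may return NumPy floats, which the standard json module cannot serialize.

```python
import json
from typing import Any, Dict, Iterable, List


def to_entities(raw: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the documented properties and coerce them to JSON-friendly types."""
    entities = []
    for item in raw:
        entities.append(
            {
                "entity_group": str(item["entity_group"]),
                "score": float(item["score"]),  # e.g. numpy.float32 -> float
                "word": str(item["word"]),
                "start": int(item["start"]),
                "end": int(item["end"]),
            }
        )
    return entities


# Made-up sample for the sequence "Marie works at Camunda".
# The offsets are illustrative; "end" is treated as exclusive here (an assumption).
raw_output = [
    {"entity_group": "PER", "score": 0.99, "word": "Marie", "start": 0, "end": 5},
    {"entity_group": "ORG", "score": 0.97, "word": "Camunda", "start": 15, "end": 22},
]

job_variables = {"entities": to_entities(raw_output)}
payload = json.dumps(job_variables)  # what a worker could send back on job completion
```

A downstream BPMN step could then read the entities variable and, for instance, filter on entity_group or on a minimum score.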