UDPipe 2.0.0
Compared to UDPipe 1:
- UDPipe 2 is Python-only and tested only on Linux,
- UDPipe 2 is meant as a research tool, not as a user-friendly UDPipe 1 replacement,
- UDPipe 2 delivers much better results, but requires a GPU to achieve reasonable performance,
- UDPipe 2 does not perform tokenization by itself – it uses UDPipe 1 for that.
UDPipe 2 is available as a REST service running at https://lindat.mff.cuni.cz/services/udpipe. If you like, you can use the udpipe2_client.py script to interact with it.
However, if you prefer to run UDPipe 2 locally, you can use this release.
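For illustration, a request can also be sent to the REST API directly. The following is only a sketch: it assumes the process endpoint and the data/model/tokenizer/tagger/parser parameters documented in the API reference at the URL above, and the model name is just an example.

```sh
# Tokenize, tag, and parse a plain-text file via the public REST API,
# extracting the CoNLL-U output from the JSON response.
curl --fail \
     -F data=@input.txt \
     -F model=czech \
     -F tokenizer= -F tagger= -F parser= \
     https://lindat.mff.cuni.cz/services/udpipe/api/process \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)["result"])' \
  > output.conllu
```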
Running Inference with Existing Models
To run UDPipe 2, you need to first download a model from the list of UDPipe 2 models. Then you can run UDPipe 2 as a local REST server, and use the udpipe2_client.py script to interact with it (in the same way as with the official service).
To run the server, use the udpipe2_server.py script; an example invocation is sketched after the parameter lists below.
- Install the packages from requirements.txt. While only TF 1 is supported for model training (ancient, I know), TF 2 can also be used for inference.
- The script has the following required options:
  - `port`: the port to listen on. We use `SO_REUSEPORT` to allow multiple processes to run concurrently, supporting seamless upgrades;
  - `default_model`: the model name to use when no model is specified in the request;
  - `models`: each model is given as a quadruple of the following parameters (each published model contains a file `MODEL.txt` with these parameters):
    - model names: any number of model names separated by `:`; furthermore, any hyphen-separated prefix of any model name can also be used as a name (e.g., `czech-pdt-ud-2.10-220711:cs_pdt-ud-2.10-220711:cs:ces:cze`);
    - model path: the path to the model directory;
    - treebank name: because multiple treebanks can be handled by a single model, the treebank name to use must be specified (this also determines which tokenizer from the model directory is used);
    - acknowledgements: a URL with the model's acknowledgements.
- The script has the following optional parameters:
  - `--batch_size`: the batch size to use (default 32);
  - `--logfile`: if specified, log to this file instead of standard error;
  - `--max_request_size`: the maximum request size, in bytes (default 4MB);
  - `--preload_models`: a list of models to preload (or `all`) immediately after start (default none);
  - `--threads`: the number of threads to use (default is to use all physical cores);
  - `--wembedding_server`: for deployment purposes, it might be useful to compute the contextualized embeddings (mBERT, RobeCzech) not in the UDPipe 2 service itself, but in a specialized service – see https://github.com/ufal/wembedding_service for documentation of the wembeddings service (default is to compute the embeddings directly in the UDPipe 2 service).
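Putting the above together, a server start-up might look roughly like this. It is only a sketch: the model name, path, treebank name, and acknowledgements URL are placeholders, and it assumes the required parameters are passed positionally in the order listed above (port, default model, then one quadruple per model), with the optional parameters passed as flags.

```sh
# Install the dependencies (TF 2 is sufficient for inference).
pip install -r requirements.txt

# Start the server on port 8001 with a single model; the quadruple is
# model names, model path, treebank name, and acknowledgements URL
# (all of them placeholders here).
python3 udpipe2_server.py 8001 czech-pdt-ud-2.10-220711 \
    czech-pdt-ud-2.10-220711:cs_pdt-ud-2.10-220711:cs:ces:cze \
    models/czech-pdt-ud-2.10-220711 \
    czech_pdt \
    https://example.com/acknowledgements \
    --preload_models all
```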
The service can be stopped by a SIGINT (Ctrl+C) or SIGUSR1 signal. Once such a signal is received, the service stops accepting new requests, but waits until all existing connections are handled and closed.
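For example, a graceful shutdown from another terminal might look like this (looking the PID up with pgrep is just one option):

```sh
# Ask the running server to stop accepting new requests and exit once
# all open connections have been handled.
kill -USR1 $(pgrep -f udpipe2_server.py)
```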
The models are loaded on demand, but they are never freed. If a GPU is available, all computation is performed on it (and an out-of-memory error might occur if too many models are loaded). If you would like to run BERT on a GPU and the remaining computation on a CPU, you can use a GPU-enabled wembeddings service together with a CPU-only UDPipe 2 service.
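A hypothetical setup for such a split is sketched below. It assumes a wembeddings service (see the repository above for how to start one) is already running on the GPU machine and listening on port 8000, and that --wembedding_server takes a host:port address; the model arguments are the same placeholders as in the earlier example.

```sh
# Hide all GPUs from TensorFlow so UDPipe 2 itself runs on the CPU, and
# delegate the contextualized embeddings to a separate GPU-enabled
# wembeddings service assumed to listen on localhost:8000.
CUDA_VISIBLE_DEVICES= python3 udpipe2_server.py 8001 czech-pdt-ud-2.10-220711 \
    czech-pdt-ud-2.10-220711:cs_pdt-ud-2.10-220711:cs:ces:cze \
    models/czech-pdt-ud-2.10-220711 \
    czech_pdt \
    https://example.com/acknowledgements \
    --wembedding_server localhost:8000
```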