UDPipe 2.0.0

@foxik released this on 05 Aug 10:49

Compared to UDPipe 1:

  • UDPipe 2 is Python-only and tested only on Linux,
  • UDPipe 2 is meant as a research tool, not as a user-friendly UDPipe 1 replacement,
  • UDPipe 2 achieves much better accuracy, but requires a GPU for reasonable runtime performance,
  • UDPipe 2 does not perform tokenization by itself – it uses UDPipe 1 for that.

UDPipe 2 is available as a REST service running at https://lindat.mff.cuni.cz/services/udpipe. If you like, you can use the udpipe2_client.py script to interact with it.
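
For a quick programmatic check, the following is a minimal Python sketch of calling the REST service directly, assuming the standard UDPipe REST API /process endpoint and its data/model/tokenizer/tagger/parser parameters (udpipe2_client.py wraps the same API):

```python
# Minimal sketch: query the public UDPipe REST service with the `requests`
# library. The endpoint and parameter names assume the standard UDPipe REST API.
import requests

response = requests.post(
    "https://lindat.mff.cuni.cz/services/udpipe/api/process",
    data={
        "model": "czech",       # model name, or any hyphen-separated prefix of one
        "tokenizer": "",        # presence of the key enables the step, empty value = defaults
        "tagger": "",
        "parser": "",
        "data": "Ahoj světe!",  # plain-text input
    },
)
response.raise_for_status()
print(response.json()["result"])  # annotated output in CoNLL-U
```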

However, if you prefer to run UDPipe 2 locally, you can use this release.

Running Inference with Existing Models

To run UDPipe 2, you need to first download a model from the list of UDPipe 2 models. Then you can run UDPipe 2 as a local REST server, and use the udpipe2_client.py script to interact with it (in the same way as with the official service).

To run the server, use the udpipe2_server.py script; an example invocation is shown after the following list of options.

  • Install the packages from requirements.txt. While only TF 1 is supported for model training (ancient, I know), you can also use TF 2 for inference.
  • The script has the following required options:
    • port: the port to listen on. We use SO_REUSEPORT to allow multiple processes to run concurrently, supporting seamless upgrades;
    • default_model: model name to use when no model is specified in the request;
    • models: each model is then a quadruple of the following parameters (each published model contains a file MODEL.txt with these parameters):
      • model names: any number of model names separated by :; furthermore, any hyphen-separated prefix of any model name can be also used as a name (e.g., czech-pdt-ud-2.10-220711:cs_pdt-ud-2.10-220711:cs:ces:cze);
      • model path: path to the model directory;
      • treebank name: because a single model can handle multiple treebanks, the treebank name selects which one to use (and also which tokenizer from the model directory);
      • acknowledgements: a URL to the model's acknowledgements.
  • The script has the following optional parameters:
    • --batch_size: batch size to use (default 32);
    • --logfile: if specified, log to this file instead of standard error;
    • --max_request_size: maximum request size, in bytes (default 4MB);
    • --preload_models: list of models to preload (or all) immediately after start (default none);
    • --threads: number of threads to use (default is to use all physical cores);
    • --wembedding_server: for deployment purposes, it might be useful to compute the contextualized embeddings (mBERT, RobeCzech) not in the UDPipe 2 service, but in a specialized service – see https://github.com/ufal/wembedding_service for documentation of the wembeddings service (default is to compute the embeddings directly in the UDPipe 2 service).
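
As an illustration of how the positional arguments fit together, the following sketch starts a server for a single Czech model. The model directory, treebank name, and acknowledgements URL are placeholders; the exact quadruple for a published model is listed in its MODEL.txt.

```python
# Illustrative only: launch udpipe2_server.py with one model, following the
# positional-argument layout described above (port, default model, then the
# per-model quadruple). The concrete values come from the model's MODEL.txt.
import subprocess

port = "8001"
default_model = "czech-pdt-ud-2.10-220711"
model_quadruple = [
    # model names: colon-separated aliases (hyphen-separated prefixes also work)
    "czech-pdt-ud-2.10-220711:cs_pdt-ud-2.10-220711:cs:ces:cze",
    # model path: directory with the downloaded model (placeholder)
    "models/czech-pdt-ud-2.10-220711",
    # treebank name, which also selects the tokenizer (placeholder)
    "cs_pdt",
    # acknowledgements URL (placeholder)
    "https://ufal.mff.cuni.cz/udpipe/2/models",
]

subprocess.run(
    ["python3", "udpipe2_server.py", port, default_model, *model_quadruple,
     "--preload_models", default_model],
    check=True,
)
```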

The service can be stopped by a SIGINT (Ctrl+C) signal or by a SIGUSR1 signal. Once such a signal is received, the service stops accepting new requests, but waits until all existing connections are handled and closed.
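
For example, a deployment script might trigger such a graceful shutdown by sending SIGUSR1 to the running server process (the PID below is a placeholder):

```python
# Send SIGUSR1 to a running udpipe2_server.py process; the server then stops
# accepting new requests and exits once all open connections are finished.
import os
import signal

SERVER_PID = 12345  # placeholder: PID of the running udpipe2_server.py
os.kill(SERVER_PID, signal.SIGUSR1)
```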

The models are loaded on demand, but they are never freed. If a GPU is available, all computation is performed on it (and an OOM might occur if too many models are loaded). If you would like to run BERT on a GPU and the remaining computation on a CPU, you can use a GPU-enabled wembeddings service plus a CPU-only UDPipe 2 service, as sketched below.
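
A hypothetical sketch of such a split deployment, assuming a GPU-enabled wembedding_service is already listening at localhost:8000 (the host:port value passed to --wembedding_server is an assumption; see the wembedding_service documentation), while the UDPipe 2 service itself is kept on the CPU by hiding all CUDA devices:

```python
# Illustrative only: run a CPU-only UDPipe 2 service that delegates the
# contextualized embeddings to a separate, GPU-enabled wembeddings service.
import os
import subprocess

env = dict(os.environ, CUDA_VISIBLE_DEVICES="")  # hide GPUs from this process

subprocess.run(
    ["python3", "udpipe2_server.py", "8001", "czech-pdt-ud-2.10-220711",
     # per-model quadruple as in the earlier example (placeholders)
     "czech-pdt-ud-2.10-220711:cs_pdt-ud-2.10-220711:cs:ces:cze",
     "models/czech-pdt-ud-2.10-220711", "cs_pdt",
     "https://ufal.mff.cuni.cz/udpipe/2/models",
     # assumed host:port of the separately running wembedding_service
     "--wembedding_server", "localhost:8000"],
    env=env,
    check=True,
)
```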