SB2020-eye edited this page Feb 7, 2021 · 2 revisions

Using OCR-D on Windows 10

OCR-D can be used on Windows 10 with one of the following methods:

  • Run OCR-D tools on Linux in a virtual machine
  • Run OCR-D tools in Docker
  • Run OCR-D tools from Windows Subsystem for Linux (WSL)

This documentation focuses on the last method.

Preconditions

A recent installation of Windows 10 with at least 15 GiB of free disk space is required. For a good user experience we suggest a modern PC with at least 8 GiB of RAM.

Using graphics cards (GPUs) to speed up some tools is still untested and is not expected to work. Fortunately, many tools don't require a GPU.

Running OCR-D tools from Windows Subsystem for Linux

The Windows Subsystem for Linux (WSL) provides a Linux environment which works under Windows 10.

Installation of WSL and Linux

Install WSL first. There are two variants of WSL, namely WSL1 and the newer WSL2.

So far, only WSL1 has been tested with OCR-D, so WSL1 is the recommended variant.

Installation guides are available from Microsoft:

Then choose a Linux distribution in the Microsoft Store. The installation was tested with Debian stable and Ubuntu 18.04 LTS, therefore it is suggested to use one of those distributions.

Installing the Linux distribution might prompt for a Microsoft user account, but the installation should also be possible without one.

Running a shell (command line)

After the installation of WSL and the Linux distribution, a shell can be started either from the Windows start menu or from the Windows command line (cmd, powershell) by running debian or ubuntu, depending on the installed Linux distribution.

The installation of the OCR-D tools and also their usage always starts from a shell. Make sure that you know the Linux password which was created during the installation of the Linux distribution. The sudo commands which are used for the installation require that password.
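Because only WSL1 has been tested, it can be useful to confirm which variant the running shell belongs to. The following is a minimal sketch, not an official check: the function name wsl_variant_from_release is made up for this example, and it assumes that WSL1 kernel release strings end in "-Microsoft" while WSL2 release strings contain "microsoft-standard".

```shell
# Classify a kernel release string as printed by `uname -r`.
# Assumption: WSL1 releases look like "4.4.0-19041-Microsoft" and WSL2
# releases contain "microsoft-standard". This is an observed naming
# convention, not a documented interface.
wsl_variant_from_release() {
    case "$1" in
        *microsoft-standard*) echo "WSL2" ;;
        *Microsoft*)          echo "WSL1" ;;
        *)                    echo "not WSL" ;;
    esac
}

# Report the variant of the shell that runs this snippet.
wsl_variant_from_release "$(uname -r)"
```

If the output is WSL2, the installation may still work, but it is outside what this page has tested.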

Installation of OCR-D tools

The installation uses OCR-D/ocrd_all and is basically the same as on a native Linux distribution. Start with these commands from the shell:

# Update the package list and upgrade the installed packages.
sudo apt update
sudo apt upgrade

# Install required packages.
sudo apt install --no-install-recommends ca-certificates git make

# Optional but suggested: needed for the Tesseract training tools.
sudo apt install libpango1.0-dev

# Clean the package cache to save disk space.
sudo apt clean

Then get OCR-D/ocrd_all, install more required packages and build the OCR-D tools. Here are the necessary commands:

mkdir -p $HOME/src/github/OCR-D
cd $HOME/src/github/OCR-D

# Get OCR-D/ocrd_all.
git clone https://github.com/OCR-D/ocrd_all.git

cd $HOME/src/github/OCR-D/ocrd_all

# Install more required packages.
# This installation can take about 20 minutes.
sudo make deps-ubuntu
# Clean the package cache to save disk space.
sudo apt clean

# Build the OCR-D tools.
# This takes from 20 minutes (modern fast desktop PC) to 60 minutes (older desktop PC).
make all

Each command should finish without an error message. Then activate the virtual Python environment that contains all OCR-D tools:

source $HOME/src/github/OCR-D/ocrd_all/venv/bin/activate

Now you are ready to run the OCR-D tools. Try to run one of them:

ocrd --help

Congratulations if that works. You are now ready to use the OCR-D tools. Each time you open a new shell and want to work with the OCR-D tools, you must activate the virtual Python environment again:

source $HOME/src/github/OCR-D/ocrd_all/venv/bin/activate
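To avoid retyping the full path in every new shell, the activation can be wrapped in a small shell function, for example in ~/.bashrc. This is only a sketch: the function name ocrd_activate and the variable OCRD_VENV are hypothetical and not part of OCR-D, and the default path is the one used in the installation steps above.

```shell
# Hypothetical helper (not part of OCR-D): activate the ocrd_all virtual
# environment. OCRD_VENV is an assumed override for a different location.
ocrd_activate() {
    local venv="${OCRD_VENV:-$HOME/src/github/OCR-D/ocrd_all/venv}"
    if [ -f "$venv/bin/activate" ]; then
        # Source the script so the environment changes apply to this shell.
        . "$venv/bin/activate"
    else
        echo "No virtual environment found at $venv" >&2
        return 1
    fi
}
```

After adding the function to ~/.bashrc and opening a new shell, typing ocrd_activate has the same effect as the source command.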

Handling of typical problems

If make all terminates with an error message, the cause may be insufficient free disk space. Example:

Successfully built ocrd-calamari
Installing collected packages: tensorflow-gpu, ocrd-calamari
ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device

The build process uses several GiB for temporary files, so make sure that enough free disk space is available.
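The free space can be checked before starting the build. The sketch below assumes GNU df (as shipped with Debian and Ubuntu), which supports the --output option. The function name free_gib is made up for this example, and the 10 GiB threshold is only the rough temporary-space estimate listed under "Required disk space".

```shell
# Print the free space, in whole GiB, of the file system holding a directory.
# Relies on `df --output`, a GNU coreutils feature.
free_gib() {
    df --output=avail -BG "$1" | tail -n 1 | tr -dc '0-9'
}

# Assumption: about 10 GiB of temporary space is needed for the build.
required_gib=10
available_gib=$(free_gib "$HOME")

if [ "$available_gib" -lt "$required_gib" ]; then
    echo "Only ${available_gib} GiB free, at least ${required_gib} GiB suggested." >&2
else
    echo "${available_gib} GiB free, which should be enough for the build."
fi
```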

Missing dependencies can also cause a build failure, so don't forget to run sudo make deps-ubuntu before running make all.

If make all still shows error messages, you can try make all -k. This will create as many OCR-D tools as possible, even if it fails to build some of them.
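With make all -k the build keeps going after a failure, so error messages can scroll past unnoticed. One way to keep track of them is to save the output to a log file and search it afterwards. This is a sketch under assumptions: the file name build.log and the function name show_build_errors are hypothetical, and the search pattern only covers lines containing "ERROR" (as in the pip message above) and the "*** ... Error N" lines that make prints for failed targets.

```shell
# Print every line of a build log that looks like a failure, with line numbers.
# Assumption: failures appear as "ERROR" (pip) or "*** ... Error N" (make).
show_build_errors() {
    grep -n -E 'ERROR|\*\*\* .*Error [0-9]+' "$1"
}

# Typical use inside the ocrd_all directory. The commands are commented out
# here because the first one starts the full build:
# make all -k 2>&1 | tee build.log
# show_build_errors build.log
```

The names of the failed targets in the reported lines show which OCR-D tools were not built.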

Required disk space

The following values are only rough estimates, so individual installations might differ.

  • 366 MiB for WSL1 with Debian stable, no additional packages
  • 55 MiB for the git and make packages
  • 5.1 GiB for OCR-D/ocrd_all
  • more than 10 GiB for some installation steps (only temporarily used)
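To compare these estimates with an actual installation, the size of the ocrd_all directory can be measured with du. A minimal sketch, assuming the directory layout from the installation steps above; the function name dir_size is made up for this example.

```shell
# Print the total size of a directory in human-readable form.
dir_size() {
    du -sh "$1" 2>/dev/null | cut -f1
}

# Path taken from the installation steps in this guide.
ocrd_all_dir="$HOME/src/github/OCR-D/ocrd_all"
if [ -d "$ocrd_all_dir" ]; then
    echo "ocrd_all uses $(dir_size "$ocrd_all_dir")"
fi
```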
