feat: ✨ Official docker images for docTR (mindee#1322)
odulcy-mindee authored Sep 28, 2023
1 parent 19f101c commit 69f6705
Showing 4 changed files with 198 additions and 21 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/docker.yml
@@ -12,9 +12,9 @@ jobs:
    steps:
      - uses: actions/checkout@v3
      - name: Build docker image
        run: docker build . -t doctr-tf-py3.8-slim
        run: docker build -t doctr-tf-py3.8-slim --build-arg SYSTEM=cpu .
      - name: Run docker container
        run: docker run doctr-tf-py3.8-slim python -c 'import doctr'
        run: docker run doctr-tf-py3.8-slim python3 -c 'import doctr'

  pytest-api:
    runs-on: ${{ matrix.os }}
86 changes: 86 additions & 0 deletions .github/workflows/public_docker_images.yml
@@ -0,0 +1,86 @@
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
#
name: Docker image on ghcr.io

on:
  push:
    tags:
      - 'v*'
  pull_request:
    branches: main
  schedule:
    - cron: '0 2 29 * *' # At 02:00 on day-of-month 29

env:
  REGISTRY: ghcr.io

jobs:
  build-and-push-image:
    runs-on: ubuntu-latest

    strategy:
      fail-fast: false
      matrix:
        # Must match version at https://www.python.org/ftp/python/
        python: ["3.8.18", "3.9.18", "3.10.13"]
        framework: ["tf", "torch"]
        system: ["cpu", "gpu"]

    # Sets the permissions granted to the `GITHUB_TOKEN` for the actions in this job.
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to the Container registry
        uses: docker/login-action@65b78e6e13532edd9afa3aa52ac7964289d1a9c1
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
        with:
          images: ${{ env.REGISTRY }}/${{ github.repository }}
          tags: |
            # used only on schedule event
            type=schedule,pattern={{date 'YYYY-MM'}},prefix=${{ matrix.framework }}-py${{ matrix.python }}-${{ matrix.system }}-
            # used only if a tag following semver is published
            type=semver,pattern={{raw}},prefix=${{ matrix.framework }}-py${{ matrix.python }}-${{ matrix.system }}-
      - name: Build Docker image
        id: build
        uses: docker/build-push-action@f2a1d5e99d037542a71f64918e516c093c6f3fc4
        with:
          context: .
          build-args: |
            FRAMEWORK=${{ matrix.framework }}
            PYTHON_VERSION=${{ matrix.python }}
            SYSTEM=${{ matrix.system }}
            DOCTR_REPO=${{ github.repository }}
            DOCTR_VERSION=${{ github.sha }}
          push: false # push only if `import doctr` works
          tags: ${{ steps.meta.outputs.tags }}

      - name: Check if `import doctr` works
        run: docker run ${{ steps.build.outputs.imageid }} python3 -c 'import doctr'

      - name: Push Docker image
        # Push only if the CI is not triggered by "PR on main"
        if: github.ref == 'refs/heads/main' && github.event_name != 'pull_request'
        uses: docker/build-push-action@f2a1d5e99d037542a71f64918e516c093c6f3fc4
        with:
          context: .
          build-args: |
            FRAMEWORK=${{ matrix.framework }}
            PYTHON_VERSION=${{ matrix.python }}
            SYSTEM=${{ matrix.system }}
            DOCTR_REPO=${{ github.repository }}
            DOCTR_VERSION=${{ github.sha }}
          push: true
          tags: ${{ steps.meta.outputs.tags }}
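
As a rough illustration of what the `strategy.matrix` in this workflow expands to, the following Python sketch enumerates the combinations and the tag prefix each job passes to `docker/metadata-action`. It is not part of the commit; the function name `tag_prefixes` is hypothetical, and the values are simply copied from the matrix.

```python
from itertools import product

# Values copied from the workflow's `strategy.matrix`.
PYTHON_VERSIONS = ["3.8.18", "3.9.18", "3.10.13"]
FRAMEWORKS = ["tf", "torch"]
SYSTEMS = ["cpu", "gpu"]


def tag_prefixes():
    """Return one tag prefix per matrix combination, e.g. 'tf-py3.8.18-cpu-'."""
    return [
        f"{framework}-py{python}-{system}-"
        for python, framework, system in product(PYTHON_VERSIONS, FRAMEWORKS, SYSTEMS)
    ]


if __name__ == "__main__":
    prefixes = tag_prefixes()
    # 3 Python versions * 2 frameworks * 2 systems = 12 jobs per workflow run.
    print(len(prefixes))
    for prefix in prefixes:
        # A scheduled build appends the YYYY-MM date to the prefix.
        print(prefix + "2023-09")
```

Each of the twelve jobs builds its image, runs the `import doctr` check, and only then reaches the push step.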
83 changes: 67 additions & 16 deletions Dockerfile
@@ -1,21 +1,72 @@
# Use the TensorFlow GPU image as the base image. This image also works with CPU-only setups
FROM tensorflow/tensorflow@sha256:b4676741c491bff3d0f29c38c369281792c7d5c5bfa2b1aa93e5231a8d236323
FROM ubuntu:22.04

ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV DOCTR_CACHE_DIR=/app/.cache
ENV DEBIAN_FRONTEND=noninteractive
ENV LANG=C.UTF-8
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1

WORKDIR /app
ARG SYSTEM=gpu

COPY . .
# Enroll NVIDIA GPG public key and install CUDA
RUN if [ "$SYSTEM" = "gpu" ]; then \
    apt-get update && \
    apt-get install -y gnupg ca-certificates wget && \
    # - Install Nvidia repo keys
    # - See: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#network-repo-installation-for-ubuntu
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb && \
    dpkg -i cuda-keyring_1.1-1_all.deb && \
    apt-get update && apt-get install -y --no-install-recommends \
        cuda-command-line-tools-11-8 \
        cuda-cudart-dev-11-8 \
        cuda-nvcc-11-8 \
        cuda-cupti-11-8 \
        cuda-nvprune-11-8 \
        cuda-libraries-11-8 \
        cuda-nvrtc-11-8 \
        libcufft-11-8 \
        libcurand-11-8 \
        libcusolver-11-8 \
        libcusparse-11-8 \
        libcublas-11-8 \
        # - CuDNN: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#ubuntu-network-installation
        libcudnn8=8.6.0.163-1+cuda11.8 \
        libnvinfer-plugin8=8.6.1.6-1+cuda11.8 \
        libnvinfer8=8.6.1.6-1+cuda11.8; \
fi

# Install necessary dependencies for video processing and GUI operations
RUN apt-get update \
    && apt-get install --no-install-recommends ffmpeg libsm6 libxext6 -y \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*
RUN apt-get update && apt-get install -y --no-install-recommends \
    # - Other packages
    build-essential \
    pkg-config \
    curl \
    wget \
    software-properties-common \
    unzip \
    git \
    # - Packages to build Python
    tar make gcc zlib1g-dev libffi-dev libssl-dev liblzma-dev libbz2-dev \
    # - Packages for docTR
    libgl1-mesa-dev libsm6 libxext6 libxrender-dev libpangocairo-1.0-0 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install the current application with TensorFlow extras and modify permissions
RUN pip install --upgrade pip setuptools wheel \
    && pip install -e .[tf] \
    && chmod -R a+w /app
# Install Python
ARG PYTHON_VERSION=3.10.13

RUN wget http://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz && \
    tar -zxf Python-$PYTHON_VERSION.tgz && \
    cd Python-$PYTHON_VERSION && \
    mkdir /opt/python/ && \
    ./configure --prefix=/opt/python && \
    make && \
    make install

ENV PATH=/opt/python/bin:$PATH

# Install docTR
ARG FRAMEWORK=tf
ARG DOCTR_REPO='mindee/doctr'
ARG DOCTR_VERSION=main
RUN pip3 install -U pip setuptools wheel && \
    pip3 install "python-doctr[$FRAMEWORK]@git+https://github.com/$DOCTR_REPO.git@$DOCTR_VERSION"
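
To make the final `pip3 install` step more concrete, this small Python sketch builds the requirement string that the `FRAMEWORK`, `DOCTR_REPO` and `DOCTR_VERSION` build arguments expand to. It is an illustration only: the helper `doctr_requirement` is hypothetical, and its defaults simply mirror the `ARG` defaults in the new Dockerfile.

```python
def doctr_requirement(framework="tf", repo="mindee/doctr", version="main"):
    """Mirror the Dockerfile's pip3 argument for the given build arguments.

    `version` can be any git ref: a branch, a tag such as "v0.7.0",
    or a commit SHA (the CI workflow passes `github.sha`).
    """
    return f"python-doctr[{framework}]@git+https://github.com/{repo}.git@{version}"


if __name__ == "__main__":
    # Defaults, as when no --build-arg is given.
    print(doctr_requirement())
    # PyTorch flavour pinned to a release tag.
    print(doctr_requirement(framework="torch", version="v0.7.0"))
```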
46 changes: 43 additions & 3 deletions README.md
@@ -2,7 +2,7 @@
<img src="docs/images/Logo_doctr.gif" width="40%">
</p>

[![Slack Icon](https://img.shields.io/badge/Slack-Community-4A154B?style=flat-square&logo=slack&logoColor=white)](https://slack.mindee.com) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) ![Build Status](https://github.com/mindee/doctr/workflows/builds/badge.svg) [![codecov](https://codecov.io/gh/mindee/doctr/branch/main/graph/badge.svg?token=577MO567NM)](https://codecov.io/gh/mindee/doctr) [![CodeFactor](https://www.codefactor.io/repository/github/mindee/doctr/badge?s=bae07db86bb079ce9d6542315b8c6e70fa708a7e)](https://www.codefactor.io/repository/github/mindee/doctr) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/340a76749b634586a498e1c0ab998f08)](https://app.codacy.com/gh/mindee/doctr?utm_source=github.com&utm_medium=referral&utm_content=mindee/doctr&utm_campaign=Badge_Grade) [![Doc Status](https://github.com/mindee/doctr/workflows/doc-status/badge.svg)](https://mindee.github.io/doctr) [![Pypi](https://img.shields.io/badge/pypi-v0.7.0-blue.svg)](https://pypi.org/project/python-doctr/) [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/mindee/doctr) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mindee/notebooks/blob/main/doctr/quicktour.ipynb)
[![Slack Icon](https://img.shields.io/badge/Slack-Community-4A154B?style=flat-square&logo=slack&logoColor=white)](https://slack.mindee.com) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) ![Build Status](https://github.com/mindee/doctr/workflows/builds/badge.svg) [![Docker Images](https://img.shields.io/badge/Docker-4287f5?style=flat&logo=docker&logoColor=white)](https://github.com/mindee/doctr/packages) [![codecov](https://codecov.io/gh/mindee/doctr/branch/main/graph/badge.svg?token=577MO567NM)](https://codecov.io/gh/mindee/doctr) [![CodeFactor](https://www.codefactor.io/repository/github/mindee/doctr/badge?s=bae07db86bb079ce9d6542315b8c6e70fa708a7e)](https://www.codefactor.io/repository/github/mindee/doctr) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/340a76749b634586a498e1c0ab998f08)](https://app.codacy.com/gh/mindee/doctr?utm_source=github.com&utm_medium=referral&utm_content=mindee/doctr&utm_campaign=Badge_Grade) [![Doc Status](https://github.com/mindee/doctr/workflows/doc-status/badge.svg)](https://mindee.github.io/doctr) [![Pypi](https://img.shields.io/badge/pypi-v0.7.0-blue.svg)](https://pypi.org/project/python-doctr/) [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/mindee/doctr) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mindee/notebooks/blob/main/doctr/quicktour.ipynb)


**Optical Character Recognition made seamless & accessible to anyone, powered by TensorFlow 2 & PyTorch**
@@ -260,10 +260,50 @@ Check out our [TensorFlow.js demo](https://github.com/mindee/doctr-tfjs-demo) to

### Docker container

If you wish to deploy containerized environments, you can use the provided Dockerfile to build a docker image:
[We offer Docker container support for easy testing and deployment](https://github.com/mindee/doctr/packages).

#### Using GPU with docTR Docker Images

The docTR Docker images are GPU-ready and based on CUDA `11.8`.
However, to use GPU support with these Docker images, please ensure that Docker is configured to use your GPU.

To verify and configure GPU support for Docker, please follow the instructions provided in the [NVIDIA Container Toolkit Installation Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).

Once Docker is configured to use GPUs, you can run docTR Docker containers with GPU support:

```shell
docker run -it --gpus all ghcr.io/mindee/doctr:tf-py3.8.18-gpu-2023-09 bash
```

#### Available Tags

The Docker images for docTR follow a specific tag nomenclature: `<framework>-py<python_version>-<system>-<doctr_version|YYYY-MM>`. Here's a breakdown of the tag structure:

- `<framework>`: `tf` (TensorFlow) or `torch` (PyTorch).
- `<python_version>`: `3.8.18`, `3.9.18`, or `3.10.13`.
- `<system>`: `cpu` or `gpu`.
- `<doctr_version>`: a docTR release tag, `v0.7.1` or later.
- `<YYYY-MM>`: the year and month of a scheduled monthly build, e.g. `2023-09`.

Here are examples of different image tags:

| Tag                          | Description                                                           |
|------------------------------|-----------------------------------------------------------------------|
| `tf-py3.8.18-cpu-v0.7.1`     | TensorFlow, Python `3.8.18`, CPU only, docTR `v0.7.1`.                |
| `torch-py3.9.18-gpu-2023-09` | PyTorch, Python `3.9.18`, GPU support, monthly build from `2023-09`.  |
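
The tag nomenclature can be checked mechanically. The following Python sketch splits a tag into its parts; it is an illustration rather than anything shipped with docTR. The names `ImageTag` and `parse_tag` are hypothetical, and the pattern assumes plain `vMAJOR.MINOR.PATCH` release tags, so pre-release suffixes would be rejected.

```python
import re
from typing import NamedTuple


class ImageTag(NamedTuple):
    framework: str
    python_version: str
    system: str
    version: str  # docTR release tag ("v0.7.1") or monthly build ("2023-09")


# <framework>-py<python_version>-<system>-<doctr_version|YYYY-MM>
_TAG_RE = re.compile(
    r"^(?P<framework>tf|torch)"
    r"-py(?P<python_version>\d+\.\d+\.\d+)"
    r"-(?P<system>cpu|gpu)"
    r"-(?P<version>v\d+\.\d+\.\d+|\d{4}-\d{2})$"
)


def parse_tag(tag: str) -> ImageTag:
    """Split a docTR image tag into its components, or raise ValueError."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a recognised docTR image tag: {tag!r}")
    return ImageTag(**match.groupdict())


if __name__ == "__main__":
    print(parse_tag("tf-py3.8.18-cpu-v0.7.1"))
    print(parse_tag("torch-py3.9.18-gpu-2023-09"))
```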

#### Building Docker Images Locally

You can also build docTR Docker images locally on your computer.

```shell
docker build -t doctr .
```

You can specify custom Python versions and docTR versions using build arguments. For example, to build a docTR image with TensorFlow, Python version `3.9.10`, and docTR version `v0.7.0`, run the following command:

```shell
docker build . -t <YOUR_IMAGE_TAG>
docker build -t doctr --build-arg FRAMEWORK=tf --build-arg PYTHON_VERSION=3.9.10 --build-arg DOCTR_VERSION=v0.7.0 .
```
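
When several variants need to be built, it can help to assemble the `docker build` command programmatically. This is a minimal Python sketch under that assumption: `docker_build_command` is a hypothetical helper, and the only build arguments it is meant for are the ones the Dockerfile declares (`FRAMEWORK`, `PYTHON_VERSION`, `SYSTEM`, `DOCTR_REPO`, `DOCTR_VERSION`).

```python
import shlex


def docker_build_command(image_tag, context=".", **build_args):
    """Return the `docker build` argument list for the given build arguments."""
    cmd = ["docker", "build", "-t", image_tag]
    for name, value in build_args.items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    cmd = docker_build_command(
        "doctr",
        FRAMEWORK="torch",
        PYTHON_VERSION="3.10.13",
        SYSTEM="cpu",  # the Dockerfile defaults to SYSTEM=gpu
    )
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing `SYSTEM="cpu"` skips the CUDA installation step, which keeps the image smaller on machines without a GPU.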

### Example script