Error when running marker in docker-compose #192

Open
sogand145 opened this issue Jun 14, 2024 · 3 comments

sogand145 commented Jun 14, 2024

Hi,
I'm using docker-compose to run marker in a container, but I get this error:

error-in-marker

This is my Dockerfile:
`FROM python:3.9-bullseye

RUN apt-get update && apt-get upgrade -y

RUN apt-get install -y build-essential libpoppler-cpp-dev pkg-config python3-dev openjdk-11-jdk ghostscript ocrmypdf

ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ENV OCR_ENGINE=ocrmypdf
ENV TORCH_DEVICE=cpu

RUN pip install marker-pdf ocrmypdf

WORKDIR /app

COPY ./by_marker-pdf /app/

CMD ["/bin/bash"]`

Note that there are no files in the "marker" folder, so I don't know how to change the default values in settings.py.
I want to use the ocrmypdf OCR engine and run on the CPU instead of a GPU. How can I change the default values?

Thanks in advance
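
On the question of changing defaults: the Dockerfile above already sets `OCR_ENGINE` and `TORCH_DEVICE` with `ENV`. Assuming marker reads its settings from environment variables (this thread does not confirm it), a quick check is whether those variables are actually visible to Python inside the container. The following is a minimal sketch; `report_overrides` is a hypothetical helper name, not part of marker.

```python
import os


def report_overrides(names, environ=None):
    """Return {name: value or None} for each requested environment variable.

    `environ` defaults to the real process environment; passing a dict
    makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    # Variable names taken from the Dockerfile above. Whether marker
    # honors them is an assumption, not something this script verifies.
    for name, value in report_overrides(["OCR_ENGINE", "TORCH_DEVICE"]).items():
        print(f"{name} = {value if value is not None else '<not set>'}")
```

If a variable prints as `<not set>`, the `ENV` line did not reach the running container, for example because the image was not rebuilt.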

@mdoughty-tagleaf

I am experiencing the same issue: the process is killed during bounding box detection. I am using the following Dockerfile:

FROM bitnami/pytorch

USER root

# Update container
RUN apt-get update
RUN apt-get upgrade -y

# Open GL
RUN apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0
RUN rm -rf /var/lib/apt/lists/*

USER 1001

# Marker
RUN pip install marker-pdf

and this Docker Compose YAML:

services:
  pdf-service:
    build:
      context: .
      dockerfile: build/PdfService.dockerfile
    tty: true
    ports:
      - "80:8484"
    volumes:
      - <pwd>/cache:/app/cache
      - <pwd>/src:/app/src
      - <pwd>/out:/app/out
      - ${HOME}/Downloads:/app/storage
    environment:
      - HF_HOME=/app/cache
      - HOME=/app/cache

and then parsing a single file inside the container:

marker_single ./storage/<file> ./out

@mdoughty-tagleaf

@sogand145, I have managed to get past the process being killed. This is a RAM-intensive tool, so the solution is to significantly increase the resources available to Docker. I cracked it open to 13.5 GB of RAM in the Docker Desktop settings and added the following to my compose YAML. Now it successfully detects a few bounding boxes before encountering a new and exciting error 🫠

    deploy:
      resources:
        limits:
          memory: 12G
          cpus: '6'
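
To confirm that a limit like the one above is actually applied inside the container, one option is to read the cgroup memory limit from within it. This is a rough sketch, assuming a Linux container that exposes the standard cgroup v2 or v1 files; `container_memory_limit` and `parse_limit` are hypothetical helper names.

```python
from pathlib import Path

# Standard locations of the memory limit under cgroup v2 and cgroup v1.
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)


def parse_limit(text):
    """Parse a cgroup limit file; cgroup v2 writes 'max' when unlimited."""
    text = text.strip()
    if text == "max":
        return None
    return int(text)


def container_memory_limit(paths=CGROUP_LIMIT_FILES):
    """Return the limit in bytes from the first readable file, or None."""
    for path in paths:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue
    return None


if __name__ == "__main__":
    limit = container_memory_limit()
    if limit is None:
        print("No cgroup memory limit found")
    else:
        print(f"Memory limit: {limit / 1024**3:.1f} GiB")
```

With `memory: 12G` in effect, the reported value should be about 12 GiB. Under cgroup v1, an unlimited container reports a very large number rather than `max`.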

@ujconsulting

Same problem here. I'm running on Windows Server 2022 in a WSL2 Ubuntu environment. Memory limits should not be an issue, because the limit is 64 GB per machine...

marker_single pushes WSL past 32 GB of used memory on a 7.6 MB PDF file with 500 pages, and then it is killed.
240806_marker_error
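
To put numbers on reports like the ones in this thread, a small wrapper can run the command, record the child's peak resident memory, and check whether it ended with SIGKILL, which is the signal the Linux OOM killer sends. This is a sketch for Linux or WSL2 only (the `resource` module is not available on Windows); `run_and_measure` is a hypothetical helper name.

```python
import resource
import signal
import subprocess
import sys


def run_and_measure(cmd):
    """Run cmd and return (returncode, peak_rss_kib, killed_by_sigkill).

    ru_maxrss for RUSAGE_CHILDREN is the largest peak among all children
    waited for so far, reported in KiB on Linux.
    """
    proc = subprocess.run(cmd)
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # subprocess reports death by signal N as returncode -N.
    killed = proc.returncode == -signal.SIGKILL
    return proc.returncode, peak_kib, killed


if __name__ == "__main__" and len(sys.argv) > 1:
    code, peak_kib, killed = run_and_measure(sys.argv[1:])
    print(f"exit code: {code}, peak RSS: {peak_kib / 1024**2:.2f} GiB")
    if killed:
        print("Process ended with SIGKILL (possibly the OOM killer)")
```

Saved under a name of your choice (here `measure.py`, an assumption), it could be invoked as `python measure.py marker_single ./storage/<file> ./out`. A SIGKILL exit is consistent with an out-of-memory kill but does not prove it; the kernel log is the authoritative source.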
