Abrupt Termination (Without any error) on Google Colab, AWS EC2 #270
Comments
What was the file size? How many pages? It could be that the instance runs out of memory.
The file size is 15.6 MB. The conversion worked fine on vast.ai's JupyterLab instance with an RTX 4090 (24 GB VRAM) and 32 GB RAM, where I used --batch_multiplier 7. Apart from the memory required for the batches (~3 GB per batch), I had assumed the program would need only a constant, minimal amount of memory regardless of the PDF size. Is that not the case?
The VRAM usage is bounded and will not grow with page count, but your RAM usage will. A workaround would be slicing your PDF into smaller pieces with PyMuPDF and merging the results.
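A minimal sketch of the slicing step suggested above, using PyMuPDF (`fitz`). The batch size, output naming, and `slice_pdf` helper are illustrative assumptions, not part of marker itself; after converting each slice you would concatenate the per-slice Markdown output yourself.

```python
def chunk_ranges(total_pages, batch_size):
    """Return inclusive (start, end) page-index pairs covering all pages."""
    return [(i, min(i + batch_size, total_pages) - 1)
            for i in range(0, total_pages, batch_size)]

def slice_pdf(src_path, batch_size=50):
    """Write each page batch to its own smaller PDF and return the new paths.

    Hypothetical helper: run marker on each returned file, then merge
    the resulting Markdown in order.
    """
    import fitz  # PyMuPDF

    doc = fitz.open(src_path)
    paths = []
    for start, end in chunk_ranges(doc.page_count, batch_size):
        part = fitz.open()  # new, empty PDF
        part.insert_pdf(doc, from_page=start, to_page=end)
        out = f"{src_path[:-4]}_{start:04d}-{end:04d}.pdf"
        part.save(out)
        part.close()
        paths.append(out)
    doc.close()
    return paths
```

For the 198-page document discussed here, `chunk_ranges(198, 50)` yields four slices: pages 0-49, 50-99, 100-149, and 150-197, each small enough that peak RAM stays roughly constant per run.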
All right |
The conversion process terminates abruptly at random points during the Detecting Boxes stage on Google Colab and on an AWS EC2 instance (Windows). The percentage at which it fails varies randomly.
AWS EC2
Google Colab
The document is 198 pages, with a mixture of selectable text, scanned text, screenshots of certificates, tables, scanned images of printed tables, etc.