JPG to PDF converter

A framework made for converting an image file to PDF file.
Usage of just 4 packages of Python.
For latest code on how to convert multiple image files to pdf, check multiple-files branch.

Working

First, this will seacrh for the text inside the image file(most probably .jpg/.PNG file).

Then, it'll fetch the text & store it in a variable.

And then, writes it into a docx file(So that if it extracts wrong text, you can manually type the correct text).

And finally, convert the word file hence created into a pdf file.

Requirements

You need to download tesseract.
You need following packages of python:
1. PIL
2. pytesseract
3. docx
4. docx2pdf
IDLE (PyCharm Community Edition preferrable)

How to run

Go to main file.
Just check the following lines in the code:

text = extractTextFromImg(imgPath) # - extract text from image file
appendDataToWord(wordPath, text) # - append data into word file
convertDocxToPdf(wordPath, pdfPath) # - convert it into pdf

You don't need to change anything in this file. The changes which you need to make is in variables file.
Just change the value of following 4 variables:
1. tesseractPath # the tesseract path you just downloaded. The one written in the file is default path
2. wordPath # the word file path. Put the word file path under wordFiles directory for better readability
3. pdfPath # the pdf file path. Put the pdf file path under pdfFiles directory for better readability

Author


Nitin Kumar

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.MD

README.MD

JPG to PDF converter

Working

Requirements

How to run

Author

Files

README.MD

Latest commit

History

README.MD

File metadata and controls

JPG to PDF converter

Working

Requirements

How to run

Author