AttributeError: 'Document' object has no attribute 'pageCount' #8972

I tried pymupdf==1.19.0, but it reported that 1.20.0 is required. I then installed pymupdf==1.20.0 and 1.21.0, and both fail with AttributeError: 'Document' object has no attribute 'pageCount'. I have no way to process PDF files.

1 reply

yonglee7015 Jun 21, 2023

Are you sure it requires 1.20.0? The version must be lower than 1.20.0, because pageCount only exists in versions lower than 1.20.0.
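
To check which side of that boundary an environment is on, a minimal sketch follows. It assumes the 1.20.0 cutoff stated in the reply above and that the package is installed under the distribution name "PyMuPDF"; the helper names `uses_snake_case_only` and `installed_pymupdf_version` are hypothetical, not part of PaddleOCR or PyMuPDF.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def uses_snake_case_only(version_string):
    """Return True if the version is 1.20.0 or newer.

    Assumption taken from the thread: pageCount is gone from 1.20.0 on.
    """
    parts = []
    for piece in version_string.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts) >= (1, 20, 0)


def installed_pymupdf_version():
    """Return the installed PyMuPDF version string, or None if not installed."""
    try:
        return version("PyMuPDF")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pymupdf_version()
    if found is None:
        print("PyMuPDF is not installed")
    elif uses_snake_case_only(found):
        print(found, "-> use page_count / get_pixmap")
    else:
        print(found, "-> pageCount should still be available")
```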

savikko · 2023-06-28T16:43:56Z

One solution which seems to work:

Edit ppocr/utils/utility.py directly.

From line 93 onward:

            for pg in range(0, pdf.page_count):
                page = pdf[pg]
                mat = fitz.Matrix(2, 2)
                pm = page.get_pixmap(matrix=mat, alpha=False)

                # if width or height > 2000 pixels, don't enlarge the image
                if pm.width > 2000 or pm.height > 2000:
                    pm = page.get_pixmap(matrix=fitz.Matrix(1, 1), alpha=False)

So change the camelCase names to snake_case:

pageCount -> page_count
getPixmap -> get_pixmap
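
If the same code has to run against both older and newer PyMuPDF releases, one option is to look the attribute up under both names instead of hard-coding one. This is only a sketch: `first_attr`, `page_count`, and `render_pixmap` are hypothetical helpers, not part of PaddleOCR or PyMuPDF, and the new name is tried first.

```python
def first_attr(obj, *names):
    """Return the first attribute of obj that exists among the given names."""
    for name in names:
        if hasattr(obj, name):
            return getattr(obj, name)
    raise AttributeError(
        "%r has none of the attributes: %s" % (type(obj).__name__, ", ".join(names))
    )


def page_count(doc):
    """Number of pages, whichever spelling the installed version provides."""
    return first_attr(doc, "page_count", "pageCount")


def render_pixmap(page, **kwargs):
    """Call get_pixmap (new name) or getPixmap (old name) with the same arguments."""
    return first_attr(page, "get_pixmap", "getPixmap")(**kwargs)


# Usage with a document opened by fitz.open(...), mirroring the loop above:
#     for pg in range(page_count(pdf)):
#         pm = render_pixmap(pdf[pg], matrix=fitz.Matrix(2, 2), alpha=False)
```

Simply renaming the two calls, as described above, is the smaller change when only PyMuPDF 1.20.0 or newer needs to be supported.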

2 replies

bolongliu Sep 27, 2023

Thanks, it's useful.

NNNorman Jan 28, 2024

In version 2.7.1, pdf2word.py has the same problem. Apply the same changes starting at line 105:
pageCount -> page_count
getPixmap -> get_pixmap
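
Since the same two renames apply to more than one file, they could also be applied mechanically. A minimal sketch, assuming only the two names mentioned in this thread need changing; `RENAMES` and `rename_deprecated` are hypothetical names, and the function works on source text, so reading and writing the actual file is left to the caller.

```python
import re

# The two renames discussed in this thread (old name -> new name).
RENAMES = {
    "pageCount": "page_count",
    "getPixmap": "get_pixmap",
}

# Match the old names only as whole identifiers.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_deprecated(source):
    """Return source text with the old camelCase names replaced."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


if __name__ == "__main__":
    print(rename_deprecated("for pg in range(0, pdf.pageCount):"))
```

Review the result before saving it, since a plain text replacement also touches comments and strings that happen to contain these names.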

vlavorini · 2024-04-09T10:39:31Z

Please also update the documentation

https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/quickstart_en.md

0 replies

Jacquelin803 · 2024-05-16T07:26:38Z

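import os

import fitz  # PyMuPDF

# pdfPath (input PDF) and imagePath (output directory) must be defined beforehand
pdfDoc = fitz.open(pdfPath)
os.makedirs(imagePath, exist_ok=True)  # create the output directory once
for pg in range(pdfDoc.page_count):
    page = pdfDoc[pg]
    # rotate = int(0)
    zoom_x = 4  # (1.33333333-->1056x816) (2-->1584x1224)
    zoom_y = 4
    mat = fitz.Matrix(zoom_x, zoom_y)
    pix = page.get_pixmap(matrix=mat, alpha=False)
    # save each page as images_<page index>.png
    pix.save(os.path.join(imagePath, 'images_%s.png' % pg))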

0 replies

TanishqSharma2022 · 2025-01-28T17:35:17Z

While executing the PDF code, I got the following error:

AttributeError: 'Document' object has no attribute 'pageCount'

I found that the bug is only present in the documentation; the library itself has already been updated. In the code, just replace pageCount with page_count and getPixmap with get_pixmap.

0 replies

LDOUBLEV Mar 21, 2023 Collaborator
