How to know a textline belongs to which page in a docx file #478

Closed
hoangthanh283 opened this issue Oct 13, 2022 · 2 comments

Comments

hoangthanh283 commented Oct 13, 2022

Description

Is there any way to know which page a text line belongs to in a .docx file?

Expected Behavior

A TextItem should have an attribute that exposes the page information:

for _, e := range extracted.Items {
      text := e.Text
      pageIndex := e.PageIndex // requested field; not in the current TextItem
      fmt.Println(pageIndex, text)
}

Actual Behavior

TextItem has only Text, DrawingInfo, Paragraph, Hyperlink, and TableInfo, but no page information.

@github-actions

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized,
other issues go into our backlog where they are assessed and fitted into the roadmap when suitable.
If you need to get this done, consider buying a license which also enables you to use it in your commercial products.
More information can be found on https://unidoc.io/

anovik commented Dec 27, 2024

This is not possible for the docx format; there is no reliable way to do it. Docx (OOXML) does not store page numbers for paragraphs, lines, etc. In MS Word, the page an item lands on is determined only during rendering.

@anovik anovik closed this as completed Dec 27, 2024