Outline-Iterator: add style and color information to OutlineItem #123

Open
HeimMatthias opened this issue Dec 10, 2024 · 1 comment

@HeimMatthias

PDF outline items can carry an RGB color and can be bold and/or italic. This information cannot currently be read or set through the OutlineItem interface. It should be added as properties of that object, so that it can be read via document.loadOutline() and outlineIterator.item(), and written via outlineIterator.insert(item) and outlineIterator.update(item).

I propose keeping the new interface properties as close to the Adobe implementation as possible:
.color (https://opensource.adobe.com/dc-acrobat-sdk-docs/library/jsapiref/JS_API_AcroJS.html#color) would be a color array preceded by its color space. I understand that MuPDF handles colors differently, and the implementation might have to mirror that behaviour (https://mupdfjs.readthedocs.io/en/latest/glossary/index.html#colors), though outline items are usually accessed independently of any device, so I'm not sure how that would work in practice. For all intents and purposes an RGB implementation would suffice, since the PDF standard (PDF32000_2008, p. 369) only allows DeviceRGB anyway, but the length of the color array itself could be used to define the color space.
.style, where "0 is normal, 1 is italic, 2 is bold, and 3 is bold-italic" (https://opensource.adobe.com/dc-acrobat-sdk-docs/library/jsapiref/JS_API_AcroJS.html#id247)
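
To make the proposal concrete, here is a rough TypeScript sketch of what the two properties could look like. Everything in it is an assumption for discussion: `OutlineItemAppearance`, `encodeStyle`, `decodeStyle` and `toRGB` are hypothetical names, not part of the current MuPDF.js API, and the CMYK conversion is a naive formula without color management.

```typescript
// Hypothetical shape of the proposed properties (not an existing MuPDF.js type).
type RGB = [number, number, number];
type OutlineStyle = 0 | 1 | 2 | 3; // 0 normal, 1 italic, 2 bold, 3 bold-italic

interface OutlineItemAppearance {
  color?: RGB;          // DeviceRGB components, each in the range 0..1
  style?: OutlineStyle; // same numbering as the Adobe bookmark style
}

// The style number is a two-bit field: bit value 1 = italic, bit value 2 = bold.
function decodeStyle(style: OutlineStyle): { italic: boolean; bold: boolean } {
  return { italic: (style & 1) !== 0, bold: (style & 2) !== 0 };
}

function encodeStyle(italic: boolean, bold: boolean): OutlineStyle {
  return ((italic ? 1 : 0) | (bold ? 2 : 0)) as OutlineStyle;
}

// Use the array length to pick the color space (1 = gray, 3 = RGB, 4 = CMYK)
// and reduce everything to the DeviceRGB triple that an outline item can store.
function toRGB(components: number[]): RGB {
  const clamp = (v: number): number => Math.min(1, Math.max(0, v));
  switch (components.length) {
    case 1: {
      const g = clamp(components[0]);
      return [g, g, g];
    }
    case 3:
      return [clamp(components[0]), clamp(components[1]), clamp(components[2])];
    case 4: {
      // Naive CMYK to RGB conversion, assumed here for illustration only.
      const [c, m, y, k] = components.map(clamp);
      return [(1 - c) * (1 - k), (1 - m) * (1 - k), (1 - y) * (1 - k)];
    }
    default:
      throw new RangeError(`unsupported color array length: ${components.length}`);
  }
}
```

Accepting arrays of other lengths on input while always storing three components would keep the interface in line with the MuPDF color convention and still only ever write DeviceRGB.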

I understand that this information can already be retrieved or set via object access, using the C and F entries of the corresponding PDF object, but this is more complex than it should be. I have implemented my own solution in this gist: https://gist.github.com/HeimMatthias/f628b8dd2ed9062e72cea544fffb3f40
However, because my script creates an outline iterator and traverses the entire outline tree every time it needs the object behind an outline item, it tends to fail on many complex documents. The same method works fine without the abstraction (i.e. if I follow the outline objects from the trailer while parsing the outline).
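
As an illustration of the single-pass approach, the sketch below walks the First/Next links once and reads C and F along the way. It is not the code from the gist: `OutlineNode` is a hypothetical plain-object stand-in for an outline item dictionary rather than a MuPDF.js type, and the defaults (black, no flags) are what I take the PDF specification to prescribe when C or F is absent.

```typescript
// Plain-object stand-in for an outline item dictionary (hypothetical model).
// The key names follow the PDF specification.
interface OutlineNode {
  Title: string;
  C?: number[];        // three DeviceRGB components; assumed default [0, 0, 0]
  F?: number;          // flags: bit value 1 = italic, bit value 2 = bold
  First?: OutlineNode; // first child
  Next?: OutlineNode;  // next sibling
}

interface StyledEntry {
  title: string;
  depth: number;
  color: [number, number, number];
  style: number; // 0 normal, 1 italic, 2 bold, 3 bold-italic
}

// Visit every item exactly once, collecting color and style as we go.
// The `seen` set guards against cyclic links in malformed documents.
function collectStyles(
  first: OutlineNode | undefined,
  depth: number = 0,
  out: StyledEntry[] = [],
  seen: Set<OutlineNode> = new Set<OutlineNode>(),
): StyledEntry[] {
  for (let node = first; node !== undefined; node = node.Next) {
    if (seen.has(node)) break;
    seen.add(node);
    const c = node.C !== undefined && node.C.length === 3 ? node.C : [0, 0, 0];
    out.push({
      title: node.Title,
      depth,
      color: [c[0], c[1], c[2]],
      style: (node.F ?? 0) & 3, // keep only the italic and bold bits
    });
    collectStyles(node.First, depth + 1, out, seen);
  }
  return out;
}
```

Because each item is touched only once, the cost grows linearly with the number of outline items instead of restarting the traversal for every lookup.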

@jamie-lemon
Collaborator

@HeimMatthias Thanks for this - this looks really interesting! I'm going to test the functionality in the gist you supplied and get back to you. If all good I think we could then hopefully integrate this formally via a PR. :)
