Retrieving Current CTM When Modifying Existing PDF File #296

ficapy · 2025-02-25T10:00:08Z

First, thank you for developing and maintaining this incredibly useful library. I truly appreciate your work.

Context:
I'm working on redacting specific PDF areas by drawing rectangles over them (coordinates derived from object detection on PDF-converted images). While using PDFHummus to append rectangles, I encountered CTM-related challenges when pages modify the default coordinate system (e.g., changing origin from bottom-left to top-left via initial cm operations).

Issue:
When modifying existing PDFs containing pre-defined CTM transformations (e.g., contentContext->cm(1,0,0,-1,0,792)), subsequent re() operations in modification contexts appear to be drawn under the page's active transformation matrix rather than the default CTM. Rectangles given in default page coordinates therefore end up misaligned.
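To make the effect of that matrix concrete, here is a minimal, self-contained sketch (plain C++, no PDFHummus calls) of how a PDF matrix [a b c d e f] maps a point from user space. The names Matrix, Point and apply are illustrative helpers only, not part of any library.

```cpp
#include <array>

// A PDF transformation matrix written as [a b c d e f].
using Matrix = std::array<double, 6>;

struct Point {
    double x;
    double y;
};

// Maps (x, y) to (a*x + c*y + e, b*x + d*y + f), as the cm operator does.
Point apply(const Matrix& m, Point p) {
    Point result = { m[0] * p.x + m[2] * p.y + m[4],
                     m[1] * p.x + m[3] * p.y + m[5] };
    return result;
}
```

With the flip matrix {1, 0, 0, -1, 0, 792}, the point (10, 10) maps to (10, 782) and (20, 20) maps to (20, 772), so a 10x10 rectangle drawn at (10, 10) under that CTM lands near the top-left corner of a 792-point-high page.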

Minimal Example:

Base PDF Creation (create_empty.cpp):

#include "PDFWriter.h"
#include "PDFPage.h"
#include "PageContentContext.h"

using namespace PDFHummus;

int main() {
    PDFWriter pdfWriter;
    pdfWriter.StartPDF("demo.pdf", ePDFVersion13);
    PDFPage* page = new PDFPage();
    page->SetMediaBox(PDFRectangle(0, 0, 612, 792));
    
    PageContentContext* contentContext = pdfWriter.StartPageContentContext(page);
    contentContext->cm(1, 0, 0, -1, 0, 792); // Flip Y-axis
    
    pdfWriter.EndPageContentContext(contentContext);
    pdfWriter.WritePageAndRelease(page);
    pdfWriter.EndPDF();
    return 0;
}

Modification Attempt (add_rect.cpp):

#include "PDFWriter.h"
#include "PDFModifiedPage.h"
#include "AbstractContentContext.h"

using namespace PDFHummus;

int main() {
    PDFWriter pdfWriter;
    pdfWriter.ModifyPDF("demo.pdf", ePDFVersion13, "demo_result.pdf");
    
    PDFModifiedPage modifiedPage(&pdfWriter, 0);
    AbstractContentContext* contentContext = modifiedPage.StartContentContext();
    
    contentContext->q();
    contentContext->k(1, 1, 0, 0);  // CMYK fill
    contentContext->re(10, 10, 10, 10); // Expected at bottom-left, appears top-left
    contentContext->f();
    contentContext->Q();
    
    modifiedPage.EndContentContext();
    modifiedPage.WritePage();
    pdfWriter.EndPDF();
    return 0;
}

Current Workaround:
I temporarily inject special text markers (e.g., "DDDDD") via PDFHummus, then parse CTM using pypdf:

from pypdf import PdfReader

def visitor_text(text, ctm, tm, font_dict, font_size):
    print(f"CTM: {ctm}", text)

reader = PdfReader("demo_result.pdf")
page = reader.pages[0]
page.extract_text(visitor_text=visitor_text)  # Extract CTM via dummy text

Request:
Would you be so kind as to suggest a recommended approach or best practice for:

1. Retrieving the current CTM state when modifying existing PDF pages
2. Resetting to the base coordinate system before drawing operations

I understand PDF content streams can be complex to parse directly. Is there an API-level method to query transformation states that I might have overlooked?

Thank you very much for your time and assistance. Any insights or suggestions you can provide would be greatly appreciated.

galkahana · 2025-02-25T16:36:53Z

I can answer in more detail re: CTM parsing, but the point is yes, you'd need to interpret the content stream. I'm doing that in a text parser solution of mine and you are welcome to take the relevant code. Here's around where I interpret the operators.

as for resetting to the base coordinate system, that's easy. go:
PDFModifiedPage modifiedPage(&pdfWriter, 0, true);
the last parameter is inEnsureContentEncapsulation. If "true" is provided, the existing page content is surrounded with a graphics state save and restore (q ... Q), so your overlay code starts from the base default matrix.
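Why the save/restore wrapping restores the default matrix can be illustrated with a toy model of CTM tracking. This is only a sketch under strong assumptions: tokens are whitespace-separated, and only q, Q and cm are interpreted (no strings, arrays, names with special handling, or inline images). The names multiply and finalCTM are hypothetical helpers, not PDFHummus API; a real page needs a proper content stream interpreter.

```cpp
#include <algorithm>
#include <array>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::array<double, 6>;

// Returns m1 x m2 in PDF's row-vector convention (m1 is applied first).
Matrix multiply(const Matrix& m1, const Matrix& m2) {
    Matrix r = { m1[0] * m2[0] + m1[1] * m2[2],
                 m1[0] * m2[1] + m1[1] * m2[3],
                 m1[2] * m2[0] + m1[3] * m2[2],
                 m1[2] * m2[1] + m1[3] * m2[3],
                 m1[4] * m2[0] + m1[5] * m2[2] + m2[4],
                 m1[4] * m2[1] + m1[5] * m2[3] + m2[5] };
    return r;
}

// Walks a simplified content stream and returns the CTM in effect at its end.
Matrix finalCTM(const std::string& content) {
    Matrix ctm = {1, 0, 0, 1, 0, 0};
    std::vector<Matrix> saved;      // graphics state stack (CTM only)
    std::vector<double> operands;   // numbers seen since the last operator
    std::istringstream tokens(content);
    std::string token;
    while (tokens >> token) {
        if (token == "q") {
            saved.push_back(ctm);
            operands.clear();
        } else if (token == "Q") {
            if (!saved.empty()) {
                ctm = saved.back();
                saved.pop_back();
            }
            operands.clear();
        } else if (token == "cm") {
            if (operands.size() >= 6) {
                Matrix m;
                std::copy(operands.end() - 6, operands.end(), m.begin());
                ctm = multiply(m, ctm);  // cm premultiplies the current CTM
            }
            operands.clear();
        } else {
            // Numbers become operands; any other token is an ignored operator.
            try {
                operands.push_back(std::stod(token));
            } catch (const std::exception&) {
                operands.clear();
            }
        }
    }
    return ctm;
}
```

In this model, finalCTM("1 0 0 -1 0 792 cm") leaves the flip matrix active for whatever is appended next, while finalCTM("q 1 0 0 -1 0 792 cm Q") ends at the identity matrix, which is the situation the encapsulation flag sets up for the overlay.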

ficapy · 2025-02-26T03:54:27Z

Thank you for your assistance. I was able to resolve the issue by using PDFModifiedPage modifiedPage(&pdfWriter, 0, true);.

Additionally, I tried implementing the first approach you mentioned (parsing the final CTM from the content stream), and it works as well. Here's the working implementation I created:

#include <iostream>
#include <string>
#include <vector>
#include <fstream>

#include "PDFParser.h"
#include "PDFDictionary.h"
#include "PDFPage.h"
#include "PDFObjectCast.h"
#include "PDFStreamInput.h"
#include "PDFObject.h"
#include "PDFReal.h"
#include "InputFile.h"
#include "ParsedPrimitiveHelper.h"

// Workaround: expose GraphicContentInterpreter's private members to the subclass below.
#define private public
#include "TextExtraction/lib/graphic-content-parsing/GraphicContentInterpreter.h"
#undef private
#include "TextExtraction/lib/graphic-content-parsing/IGraphicContentInterpreterHandler.h"

using namespace PDFHummus;

class DummyContentHandler : public IGraphicContentInterpreterHandler {
public:
    virtual bool OnTextElementComplete(const TextElement &inTextElement) {
        return true;
    }

    virtual bool OnPathPainted(const PathElement &inPathElement) {
        return true;
    }

    virtual bool OnResourcesRead(const Resources &inResources, IInterpreterContext *inContext) {
        return true;
    }
};

class CTMTrackingInterpreter : public GraphicContentInterpreter {
public:
    CTMTrackingInterpreter() {
        UnitMatrix(mFinalCTM);
    }

    virtual ~CTMTrackingInterpreter() {}

    // Captures CTM state after each operation
    virtual bool OnOperation(const std::string &inOperation,
                             const PDFObjectVector &inOperands,
                             IInterpreterContext *inContext) override {
        bool continueInterpret = GraphicContentInterpreter::OnOperation(inOperation, inOperands, inContext);

        for (int i = 0; i < 6; i++) {
            mFinalCTM[i] = CurrentGraphicState().ctm[i];
        }
        return continueInterpret;
    }

    void GetFinalCTM(double *outCTM) const {
        for (int i = 0; i < 6; i++) {
            outCTM[i] = mFinalCTM[i];
        }
    }

private:
    double mFinalCTM[6];
};

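int main() {
    double finalCTM[6];
    InputFile pdfFile;
    if (pdfFile.OpenFile("demo.pdf") != eSuccess) {
        std::cerr << "Failed to open demo.pdf" << std::endl;
        return 1;
    }

    PDFParser parser;
    if (parser.StartPDFParsing(pdfFile.GetInputStream()) != eSuccess) {
        std::cerr << "Failed to parse demo.pdf" << std::endl;
        return 1;
    }

    RefCountPtr<PDFDictionary> firstPage = parser.ParsePage(0);
    if (!firstPage.GetPtr()) {
        std::cerr << "Failed to read page 0" << std::endl;
        return 1;
    }

    CTMTrackingInterpreter interpreter;
    DummyContentHandler handler;
    bool interpretOK = interpreter.InterpretPageContents(
        &parser,
        firstPage.GetPtr(),
        &handler
    );
    if (!interpretOK) {
        std::cerr << "Warning: interpretation did not complete; the CTM may be partial" << std::endl;
    }
    interpreter.GetFinalCTM(finalCTM);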

    std::cout << "Final CTM [ "
              << finalCTM[0] << ", "
              << finalCTM[1] << ", "
              << finalCTM[2] << ", "
              << finalCTM[3] << ", "
              << finalCTM[4] << ", "
              << finalCTM[5] << " ]" << std::endl;
    return 0;
}
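Once the final CTM is known, one possible way to get back to the default coordinate system without encapsulation is to concatenate its inverse. The following is a minimal sketch in plain C++; invert is a hypothetical helper, not something the library is known to provide, and it assumes the matrix is in the usual [a b c d e f] order.

```cpp
#include <array>

using Matrix = std::array<double, 6>;

// Writes the inverse of m into out; returns false when m is singular.
bool invert(const Matrix& m, Matrix& out) {
    const double det = m[0] * m[3] - m[1] * m[2];
    if (det == 0.0) {
        return false;  // no inverse: the matrix collapses the plane
    }
    out[0] =  m[3] / det;
    out[1] = -m[1] / det;
    out[2] = -m[2] / det;
    out[3] =  m[0] / det;
    out[4] = (m[2] * m[5] - m[3] * m[4]) / det;
    out[5] = (m[1] * m[4] - m[0] * m[5]) / det;
    return true;
}
```

For the flip [1 0 0 -1 0 792] written by create_empty.cpp, the inverse is the same matrix. Since cm premultiplies the current CTM, passing the six inverse values to contentContext->cm(...) inside the overlay's q ... Q block should bring drawing back to default page coordinates.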

galkahana · 2025-02-26T07:51:52Z

Amazing :)
