Checkboxes Failing to Detect State for Valid PDFs #43

Open
TristanHammat-AgilisIT opened this issue Dec 11, 2020 · 4 comments

Comments

TristanHammat-AgilisIT commented Dec 11, 2020

The code seems to assume that the checkbox "on" definition is always first in the "/AP" appearance dictionary. As a result, the checkbox code sets the wrong state for any PDF where the "off" definition comes before the "on" definition.

For example, a PDF with this checkbox appearance dictionary

/AP 
<<
/D 
<<
/Off 14 0 R
/Yes 15 0 R
>>
/N 
<<
/Off 16 0 R
/Yes 17 0 R
>>
>>

will produce the following incorrect entries in the fpdm [infos] array:

[checkbox_yes] => Off
[checkbox_no] => Yes

I have checked ISO 32000 and the older PDF 1.4 reference, and neither seems to require the "on" definition to appear before the "off" definition, which would explain why a lot of PDFs use the reverse order. Both references do, on the other hand, seem to state that the off-state definition must be named "Off", so checking the name might be a better way to tell which definition is which.
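
The name-based idea above can be sketched in isolation. This is only an illustration, not FPDM code: the helper classify_checkbox_states() is hypothetical, and it assumes the state names of one appearance sub-dictionary have already been extracted without their leading slash.

```php
<?php
/**
 * Hypothetical helper (not part of FPDM): decide which state name is the
 * "on" state and which is the "off" state by name rather than by position.
 *
 * @param string[] $names State names in the order they appear in the PDF.
 * @return array{checkbox_yes: string, checkbox_no: string}
 */
function classify_checkbox_states(array $names): array
{
    $states = ['checkbox_yes' => '', 'checkbox_no' => ''];
    foreach ($names as $name) {
        if ($name === 'Off') {
            // The off state is identified by its name.
            $states['checkbox_no'] = $name;
        } elseif ($states['checkbox_yes'] === '') {
            // The first name that is not "Off" is taken as the on state.
            $states['checkbox_yes'] = $name;
        }
    }
    return $states;
}

// Both orderings are classified the same way.
print_r(classify_checkbox_states(['Off', 'Yes']));
print_r(classify_checkbox_states(['Yes', 'Off']));
```

With this approach the order of the entries no longer matters; a dictionary that has no "Off" entry simply leaves checkbox_no empty.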

Here is my horrible, rushed, hacked, redundancy-filled solution.
In fpdm.php, starting at line 1859, I replaced this:

                                        } elseif (($ap_line==$Counter-4)&&($ap_d_line==$Counter-2)&&($ap_d_yes=='')&&$this->extract_pdf_definition_value("name", $CurLine, $match)) {
                                            $ap_d_yes=$match[1];
                                            if ($verbose_parsing) {
                                                echo("<br>Object's checkbox_yes is '<i>$ap_d_yes</i>'");
                                            }
                                            $object["infos"]["checkbox_yes"]=$ap_d_yes;
                                        } elseif (($ap_line==$Counter-5)&&($ap_d_line==$Counter-3)&&($ap_d_no=='')&&$this->extract_pdf_definition_value("name", $CurLine, $match)) {
                                            $ap_d_no=$match[1];
                                            if ($verbose_parsing) {
                                                echo("<br>Object's checkbox_no is '<i>$ap_d_no</i>'");
                                            }
                                            $object["infos"]["checkbox_no"]=$ap_d_no;

With this:

                                        } elseif (($ap_line==$Counter-4)&&($ap_d_line==$Counter-2)&&$this->extract_pdf_definition_value("name", $CurLine, $match)) {
                                            $ap_d_first=$match[1];
                                            if($ap_d_first!="Off") {
                                                if ($verbose_parsing) {
                                                    echo("<br>Object's checkbox_yes is '<i>$ap_d_first</i>'");
                                                }
                                                $ap_d_yes=$ap_d_first;
                                                $object["infos"]["checkbox_yes"]=$ap_d_first;
                                            }
                                            else {
                                                if ($verbose_parsing) {
                                                    echo("<br>Object's checkbox_no is '<i>$ap_d_first</i>'");
                                                }
                                                $ap_d_no=$ap_d_first;
                                                $object["infos"]["checkbox_no"]=$ap_d_first;
                                            }
                                        } elseif (($ap_line==$Counter-5)&&($ap_d_line==$Counter-3)&&$this->extract_pdf_definition_value("name", $CurLine, $match)) {
                                            $ap_d_second=$match[1];
                                            if($ap_d_second!="Off") {
                                                if ($verbose_parsing) {
                                                    echo("<br>Object's checkbox_yes is '<i>$ap_d_second</i>'");
                                                }
                                                $ap_d_yes=$ap_d_second;
                                                $object["infos"]["checkbox_yes"]=$ap_d_second;
                                            }
                                            else {
                                                if ($verbose_parsing) {
                                                    echo("<br>Object's checkbox_no is '<i>$ap_d_second</i>'");
                                                }
                                                $ap_d_no=$ap_d_second;
                                                $object["infos"]["checkbox_no"]=$ap_d_second;
                                            }
                                        } 

Also, another possible minor issue: the code seems to look for the definitions in the appearance dictionary's optional "down appearance" (/D) instead of the required "normal appearance" (/N). This might also cause problems for PDFs that include the "normal appearance" definitions but not the optional "down appearance".
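
One way to handle this would be to read the state names from /N and only fall back to /D when /N is missing. The sketch below is an assumption-laden illustration, not FPDM code: pick_state_names() is a hypothetical helper, and it assumes the /AP dictionary has already been parsed into a nested PHP array keyed by entry name.

```php
<?php
/**
 * Hypothetical helper (not part of FPDM): choose the appearance
 * sub-dictionary to read checkbox state names from.
 *
 * $ap is assumed to look like:
 *   ['N' => ['Off' => '16 0 R', 'Yes' => '17 0 R'], 'D' => [...]]
 *
 * @return string[] State names from /N if present, otherwise from /D.
 */
function pick_state_names(array $ap): array
{
    foreach (['N', 'D'] as $key) {
        if (!empty($ap[$key]) && is_array($ap[$key])) {
            // PHP turns numeric-looking keys such as "1" into integers,
            // so cast the names back to strings.
            return array_map('strval', array_keys($ap[$key]));
        }
    }
    return [];
}

// A checkbox that only defines the normal appearance still yields its states.
print_r(pick_state_names(['N' => ['Off' => '16 0 R', 'Yes' => '17 0 R']]));
```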

@TristanHammat-AgilisIT
Author

For anyone else with this issue, I have now forked this repo and added this rough fix along with a couple of other quick and dirty fixes. Performed some limited testing and so far all good.
https://github.com/TristanHammat-AgilisIT/fpdm/

@belicoffpy

Hi, I still have the same problem.

@TristanHammat-AgilisIT
Author

Even when using my fork?

PimprenelIe commented Jul 26, 2021

Hello, this is a solution that works for me.
On line 1878, I initialize the two entries in $object["infos"] like this:

             elseif (($as=='')&&$this->extract_pdf_definition_value("/AS", $CurLine, $match)) {
                  $as=$match[1];
                  $object["infos"]["checkbox_yes"] = "";
                  $object["infos"]["checkbox_no"] = "";
                  if ($verbose_parsing) {
                      echo("<br>Object's AS is '<i>$as</i>'");
                  }
                  $object["infos"]["checkbox_state"]=$as;
                  $object["infos"]["checkbox_state_line"]=$Counter;
              }

I added these lines:

$object["infos"]["checkbox_yes"] = "";
$object["infos"]["checkbox_no"] = "";

To make the change cleanly, without editing the library file itself, you can extend the FPDM class, copy the function parsePDFEntries() into your subclass, and make the update there.
After that, use your new class instead of \FPDM.

Example:

class FPDMupdate extends \FPDM
{

    function parsePDFEntries(&$lines){
             [...]
             elseif (($as=='')&&$this->extract_pdf_definition_value("/AS", $CurLine, $match)) {
                  $as=$match[1];
                  $object["infos"]["checkbox_yes"] = "";
                  $object["infos"]["checkbox_no"] = "";
                  if ($verbose_parsing) {
                      echo("<br>Object's AS is '<i>$as</i>'");
                  }
                  $object["infos"]["checkbox_state"]=$as;
                  $object["infos"]["checkbox_state_line"]=$Counter;
              }
             [...]
     }
}


$pdf = new FPDMupdate("template.pdf");
