Bug report
When I tried to extract page 69 from this Lloyds PDF document I received a PSSyntaxError. After investigating, I found that it is caused by the #885 bug fix. Because that fix passes along all keywords that end the code stream, some `true` and `false` booleans end up split into two tokens - see the dictionary below.
pdf: https://www.lloydsbankinggroup.com/assets/pdfs/investors/financial-performance/lloyds-banking-group-plc/2023/q4/2023-lbg-annual-report.pdf
>>> from pdfminer.high_level import extract_text
>>> text = extract_text("2023-lbg-annual-report.pdf", page_numbers=[69])
>>> print(text)
There's a very long error message, but this is the final line:
PSSyntaxError: Invalid dictionary construct: [/'CS', <PDFObjRef:113318>, /'I', False, /'K', /b'tr', /b'ue', /'S', /'Transparency', /'Type', /'Group']
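To illustrate the symptom: the list in the error has 11 entries, so it cannot be paired into keys and values, which appears to be why the dictionary is rejected. Below is a minimal sketch using plain Python stand-ins for the tokens (not pdfminer objects). `merge_split_booleans` is a hypothetical helper written only for this illustration and is not part of pdfminer.six. It shows that rejoining `tr` and `ue` gives an even-length list again.

```python
# Plain-Python stand-ins for the tokens shown in the error message.
# No pdfminer objects are used here; this only illustrates the symptom.
tokens = [
    "/CS", "<PDFObjRef:113318>",
    "/I", False,
    "/K", b"tr", b"ue",          # the boolean `true`, split into two keywords
    "/S", "/Transparency",
    "/Type", "/Group",
]


def merge_split_booleans(items):
    """Hypothetical helper: rejoin adjacent bytes fragments spelling true/false."""
    merged = []
    i = 0
    while i < len(items):
        cur = items[i]
        nxt = items[i + 1] if i + 1 < len(items) else None
        if (
            isinstance(cur, bytes)
            and isinstance(nxt, bytes)
            and cur + nxt in (b"true", b"false")
        ):
            # Replace the two fragments with the boolean they spell.
            merged.append(cur + nxt == b"true")
            i += 2
        else:
            merged.append(cur)
            i += 1
    return merged


# An odd number of entries cannot form key/value pairs.
print(len(tokens) % 2 != 0)

fixed = merge_split_booleans(tokens)
# With the boolean rejoined, the entries pair up as key -> value.
pairs = dict(zip(fixed[0::2], fixed[1::2]))
print(pairs)
```

This is not a proposed fix, only a way to see why the split keyword breaks the dictionary; the real fix presumably belongs in the tokenizer.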
If you need any more information, please feel free to contact me,
LB207