change panic to Result<> parameter #97

Open
wants to merge 1 commit into
base: master

jankstar

Hi,
this pull request removes some panics in favor of returning Results.
My PDF ran into some issues with missing objects. Sorry for this quick fix (a temporary solution); I don't think it makes things worse.
kind regards
JanK
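
To illustrate the kind of change described above, here is a minimal sketch of turning a panicking code path into one that returns a Result. It is not the code from this pull request: `ExtractError`, `parse_cmap_name`, and `describe_font` are hypothetical names invented for the example, and the "Identity-H" check is only a placeholder for whatever the library actually supports.

```rust
use std::fmt;

/// Hypothetical error type for illustration; not part of pdf-extract's API.
#[derive(Debug, PartialEq)]
pub enum ExtractError {
    /// Carries a debug rendering of the cmap that could not be handled.
    UnsupportedCMap(String),
}

impl fmt::Display for ExtractError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExtractError::UnsupportedCMap(detail) => write!(f, "unsupported cmap {}", detail),
        }
    }
}

impl std::error::Error for ExtractError {}

/// Hypothetical stand-in for a lookup that previously panicked.
/// Instead of `panic!("unsupported cmap {:?}", cmap)`, it returns an Err.
pub fn parse_cmap_name(cmap: Option<&str>) -> Result<String, ExtractError> {
    match cmap {
        // Placeholder for "a cmap we know how to handle".
        Some(name) if name == "Identity-H" => Ok(name.to_string()),
        other => Err(ExtractError::UnsupportedCMap(format!("{:?}", other))),
    }
}

/// Callers propagate the error with `?` instead of unwinding.
pub fn describe_font(cmap: Option<&str>) -> Result<String, ExtractError> {
    let name = parse_cmap_name(cmap)?;
    Ok(format!("font with cmap {}", name))
}
```

With this shape, an application can decide per document whether to skip the file, log the error, or abort, rather than having the worker thread unwind.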

@jrmuizel
Owner

Can you provide a link to the PDF?

@jankstar
Author

Hi,
sorry, unfortunately not.
The PDF is an invoice and contains personal information about my address and payments.

I have the dump here:

thread 'tokio-runtime-worker' panicked at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:897:16:
unsupported cmap Some(<</Length 5991>>)
stack backtrace:
0: rust_begin_unwind
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
1: core::panicking::panic_fmt
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
2: pdf_extract::get_unicode_map
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:897:16
3: pdf_extract::PdfSimpleFont::new
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:422:31
4: pdf_extract::make_font
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:329:17
5: pdf_extract::Processor::process_stream::{{closure}}
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:1620:84
6: std::collections::hash::map::Entry<K,V>::or_insert_with
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/collections/hash/map.rs:2666:43
7: pdf_extract::Processor::process_stream
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:1620:32
8: pdf_extract::output_doc
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:2238:9
9: pdf_extract::extract_text
at /Users/jan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pdf-extract-0.7.10/src/lib.rs:2133:9
10: app_lib::do_status_message_handler::do_status::{{closure}}
at ./src/do_status_message_handler.rs:152:34
11: app_lib::do_status_message_handler::do_status_message_handler::{{closure}}::{{closure}}
at ./src/do_status_message_handler.rs:688:40
....
In the PDF, this is the place where the length of the stream is declared (I guess):
...
4 0 obj
<</Length 5991

stream
/CIDInit /ProcSet findresource
begin
12 dict
begin
begincmap
/CIDSystemInfo
<< /Registry (Adobe)
/Ordering (UCS)
/Supplement 0
>>
def
/CMapName /Adobe-Identity-UCS def
/CMapType 1 def
/CMapVersion 1 def
...
kind regards
JanK

@jankstar
Author

PS: there is a “>>” missing in my post:
'94 0 obj'
'<</Length 5991'
'>>'
'stream'

@greenhat616

Thank you. We are also running into PDFs with these issues:
image

In our case, we just use catch_unwind to ignore these panics; however, they are still captured by Sentry.

Panicking is not an ideal way to handle such issues in a production program. It would be better to use Result instead, even if that is only enabled behind a feature gate.
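
For readers unfamiliar with the workaround mentioned above, the following is a rough sketch of wrapping a panicking call in `std::panic::catch_unwind` and converting the panic into an Err. `extract_or_panic` and `extract_checked` are hypothetical names standing in for a library call and an application wrapper; this is not the commenter's actual code.

```rust
use std::panic;

/// Hypothetical stand-in for a library call that panics on bad input.
fn extract_or_panic(input: &str) -> String {
    if input.is_empty() {
        panic!("unsupported input");
    }
    input.to_uppercase()
}

/// Runs the call inside catch_unwind and maps a panic to an Err
/// carrying the panic message when it can be recovered.
pub fn extract_checked(input: &str) -> Result<String, String> {
    panic::catch_unwind(|| extract_or_panic(input)).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}
```

Note that `catch_unwind` only stops the unwinding; the process-wide panic hook still runs first, which is one plausible reason an error reporter would still record these panics. It also has no effect when the program is built with `panic = "abort"`.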

@jrmuizel
Owner

jrmuizel commented Jan 5, 2025

@greenhat616 A bunch of those crashes should be fixed in pdf-extract 0.8 which I just released. I'd be interested to see a refreshed list of top crashes after you update to 0.8.

3 participants