Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error handling - replace .unwrap and panic with ? #47

Open
joepio opened this issue Feb 11, 2023 · 7 comments
Open

Error handling - replace .unwrap and panic with ? #47

joepio opened this issue Feb 11, 2023 · 7 comments

Comments

@joepio
Copy link
Contributor

joepio commented Feb 11, 2023

Hi there! Thanks for creating an maintaining this :)

I like what this crate does, but I can't really use it as a library in my project because there are many (quite common) paths that will panic. I want to handle errors, not crash my app.

I suggest we change all .unwrap, .expect and panic with ? and Err. We add a custom Result type and implement impl From<&str> for Err for convenience.

UPDATE: I've opened a PR that changes most of the project to return results.

joepio added a commit to joepio/pdf-extract that referenced this issue Feb 11, 2023
joepio added a commit to joepio/pdf-extract that referenced this issue Feb 21, 2023
@griccardos
Copy link

Agreed, great library, with the potential to be very useful. It works so well so often!
Unexpected panics however limit the usefulness even if it only occurs 1 in a 1000 times.
PDF is complicated, so a perfect library may never be possible, but the ability to ignore errors if one wishes in their own project is essential. The use of errors bubbling up gives us this ability.

@scandox
Copy link

scandox commented Aug 27, 2024

I agree. I have an app processing millions of PDFs, extracting text from them. For my use case if the text cannot be extracted I just move on to the next without hard feelings. With this library I can't take this approach.

@jrmuizel
Copy link
Owner

Please share any pdfs that panic. One of the main reasons for tending toward panic instead of returning an error is that it increases the likelihood of people reporting pdfs that cause panics.

@scandox
Copy link

scandox commented Aug 27, 2024

That is a good strategy! Unfortunately most of what I deal with are not shareable on the face of it. However I will look into whether there is some way to do it. Certainly it would be nice to contribute to a more robust extractor.

@jrmuizel
Copy link
Owner

@scandox can you share some of the panic locations/messages?

@joepio
Copy link
Contributor Author

joepio commented Aug 28, 2024

Please share any pdfs that panic. One of the main reasons for tending toward panic instead of returning an error is that it increases the likelihood of people reporting pdfs that cause panics.

This is useful. I think it would make sense to add a panic wrapper that adds an explanation to people to:

  • Make an issue describing the panic, include the PDF that was erring
  • Or: add a PR adding the PDF to the test suite

@scandox
Copy link

scandox commented Aug 28, 2024

Please share any pdfs that panic. One of the main reasons for tending toward panic instead of returning an error is that it increases the likelihood of people reporting pdfs that cause panics.

It looks like there are conditions under which I can share PDFs but not as part of the public repository. If you like you could contact me directly. My email is my username at the well known Google owned email service.

I have about 650 ones that panic. About 22K that cause various unknown errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants