Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very slow decoding #40

Closed
novitae opened this issue Jul 29, 2024 · 4 comments
Closed

Very slow decoding #40

novitae opened this issue Jul 29, 2024 · 4 comments

Comments

@novitae
Copy link

novitae commented Jul 29, 2024

Hello, I am using blackboxprotobuf to decode a YouTube response, and it's very slow. The response file weighs 1.3mb, and it took me 19 minutes to decode it, as shown below.
Capture d’écran 2024-07-29 à 15 19 38
Even with the typedef it's extremely slow. Here's the file:
slow_proto.txt

Is there something I can do to improve this ?

@rwinkelmaier-ncc
Copy link
Collaborator

Hi! Thanks for the report and for providing the example file. I was able to replicate and run the profiler.

It looks like most of the time is being spent within copy.deepcopy calls. Not sure if there's something special about this protobuf that triggered it, or just the large size, but I'll look into improving the time. Removing these three deepcopy calls (https://github.com/search?q=repo%3Anccgroup%2Fblackboxprotobuf%20copy.deepcopy&type=code) should drastically speed up the parsing, but there might be some edge cases that don't work correctly. I'll have to think through those cases and figure out the best way to handle without as many copies. Might be a good chance to rewrite the clunky alt_typedef system as well.

@rwinkelmaier-ncc
Copy link
Collaborator

Hi! I just pushed a fix to the repo which should fix this and improve performance generally. I think it could have been fixed with a few strategic "copy.copy" calls, but I wanted to be sure I wasn't missing any edge cases so ended up rewriting quite a bit. There's definitely more room for performance improvements as well.

I haven't pushed to pypi yet, but will next week after I double check a few more things.

@novitae
Copy link
Author

novitae commented Aug 8, 2024

Nice ! I think you might want to give a try to dict.copy or list.copy, that is builtin those classes. Other values (str, int, float, bool and None) are copied by default, so they don't need it.

@rwinkelmaier-ncc
Copy link
Collaborator

Ok. The changes should be pushed to pypi now as version 1.4.0. Let me know if you have any issues.

I tried a few different tweaks to improve performance further, such as doing copy-on-write and protocols instead of full classes, but didn't see any performance gains for this payload so left it as-is for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants