perf: improved hsv color mapping in HeatMapAnnotator by 20x #1786

Merged
1 commit merged into roboflow:develop on Feb 19, 2025

Conversation

Armaggheddon
Contributor

Description

Replaced np.zeros() and the subsequent fill operations with np.full(). This speeds up the operation on a 1920x1080 frame by ~16x.
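
For readers who want to see the pattern in isolation, here is a minimal sketch of the two allocation styles. The function names `hsv_zeros_then_fill` and `hsv_full` are hypothetical, and the sketch assumes the earlier code allocated with NumPy's default float64 dtype and converted to uint8 afterwards; the actual HeatMapAnnotator code may differ in detail.

```python
import numpy as np


def hsv_zeros_then_fill(hue: np.ndarray) -> np.ndarray:
    """Assumed 'before' pattern: float64 zeros, per-channel fills, then a cast."""
    hsv = np.zeros((*hue.shape, 3))  # default dtype is float64
    hsv[..., 0] = hue                # hue channel
    hsv[..., 1] = 255                # full saturation
    hsv[..., 2] = 255                # full value
    return hsv.astype(np.uint8)      # extra full-frame copy for the cast


def hsv_full(hue: np.ndarray) -> np.ndarray:
    """Assumed 'after' pattern: allocate uint8 pre-filled with 255, set hue only."""
    hsv = np.full((*hue.shape, 3), 255, dtype=np.uint8)
    hsv[..., 0] = hue
    return hsv


# Illustrative hue map for a 1280x720 frame (OpenCV-style hue range 0..179).
rng = np.random.default_rng(0)
hue_map = rng.integers(0, 180, size=(720, 1280), dtype=np.uint8)

before = hsv_zeros_then_fill(hue_map)
after = hsv_full(hue_map)
assert np.array_equal(before, after)  # both patterns yield the same HSV image
```

Under these assumptions the second version writes one channel instead of three and skips the float64 buffer and the final cast, which is a plausible source of the measured speedup.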

Type of change

  • Performance improvement

How has this change been tested, please provide a testcase or example of how you tested the change?

Timed the default implementation against the updated one by running each in a loop with time.perf_counter().
Results are averaged over 10000 iterations:

  • 3840x2160 frame: default averages 62 ms, improved averages 5 ms (11.5x faster)
  • 1920x1080 frame: default averages 16 ms, improved averages 1 ms (16x faster)
  • 1280x720 frame: default averages 8 ms, improved averages 0.3 ms (22.5x faster)

All comparisons have been performed on the same hardware.
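
The measurement approach described above can be sketched roughly as follows. This is not the script used for the numbers in this PR: the helper names (`baseline`, `improved`, `average_ms`) are hypothetical, the "before" pattern is assumed to use float64 zeros followed by a cast, and the iteration count is kept small so the sketch runs quickly.

```python
import time

import numpy as np


def baseline(hue: np.ndarray) -> np.ndarray:
    # Assumed original pattern: float64 zeros, three channel fills, cast to uint8.
    hsv = np.zeros((*hue.shape, 3))
    hsv[..., 0] = hue
    hsv[..., 1] = 255
    hsv[..., 2] = 255
    return hsv.astype(np.uint8)


def improved(hue: np.ndarray) -> np.ndarray:
    # Assumed updated pattern: uint8 array pre-filled with 255, hue written once.
    hsv = np.full((*hue.shape, 3), 255, dtype=np.uint8)
    hsv[..., 0] = hue
    return hsv


def average_ms(fn, hue: np.ndarray, iterations: int) -> float:
    """Return the mean wall-clock time of fn(hue) in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn(hue)
    return (time.perf_counter() - start) * 1000.0 / iterations


ITERATIONS = 5  # the PR reports averages over far more iterations
results = {}
for height, width in [(720, 1280), (1080, 1920), (2160, 3840)]:
    hue = np.random.default_rng(0).integers(
        0, 180, size=(height, width), dtype=np.uint8
    )
    base_ms = average_ms(baseline, hue, ITERATIONS)
    new_ms = average_ms(improved, hue, ITERATIONS)
    results[(height, width)] = (base_ms, new_ms)
    print(f"{width}x{height}: base {base_ms:.3f} ms, improved {new_ms:.3f} ms")
```

Absolute timings depend heavily on hardware, NumPy build, and memory pressure, so the ratios printed by this sketch should not be expected to match the figures reported here.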

Any specific deployment considerations

No additional requirements

@CLAassistant

CLAassistant commented Feb 10, 2025

CLA assistant check
All committers have signed the CLA.

@SkalskiP
Collaborator

Hi @Armaggheddon 👋🏻 Thank you so much for your interest in supervision. It looks interesting. Could you share a script (preferably a Google Colab) where we could measure the performance boost?

@Armaggheddon
Contributor Author

Hi @SkalskiP! I am happy to contribute 😁. Sure, below is the link to a Colab notebook that implements just the core logic I committed and compares my implementation against the current one. The code should be pretty easy to follow. The notebook shows slightly different trends from the ones I observed on my machine, but the improvement is still noticeable. Here is the LINK

These are the results I obtained running 100 iterations for each resolution:

Measuring perf with 
- (720, 1280, 3):
	- [BASE] Avg:  10.109914 ms, Total:  1010.991401 ms
	- [IMPROVED] Avg:  1.055095 ms, Total:  105.509520 ms
	==>  9.58x times faster
- (1080, 1920, 3):
	- [BASE] Avg:  98.214145 ms, Total:  9821.414546 ms
	- [IMPROVED] Avg:  3.421949 ms, Total:  342.194868 ms
	==>  28.70x times faster
- (2160, 3840, 3):
	- [BASE] Avg:  243.822512 ms, Total:  24382.251157 ms
	- [IMPROVED] Avg:  10.611352 ms, Total:  1061.135250 ms
	==>  22.98x times faster

Hope it helps!

@SkalskiP SkalskiP merged commit acf1ef6 into roboflow:develop Feb 19, 2025
2 checks passed
@SkalskiP
Collaborator

Thanks a lot @Armaggheddon! 🔥 Merging!
