Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Examine memory usage for potential leaks #445

Open
hummeltech opened this issue Jun 14, 2024 · 10 comments
Open

Examine memory usage for potential leaks #445

hummeltech opened this issue Jun 14, 2024 · 10 comments

Comments

@hummeltech
Copy link
Collaborator

hummeltech commented Jun 14, 2024

Continuation of discussion started here: #181 (comment)_

@hummeltech
Copy link
Collaborator Author

hummeltech commented Jun 14, 2024

Here's a quick and dirty visualization of the respective memory footprints. All tests are done consecutively, but plotted on the same (relative) time axis.

Conclusions so far:

  • jemalloc seems the only one that returns substantial memory to the system
  • tcmalloc seems to not ever return anything, maybe additional configuration is required?
  • glib does have a dip here and there - are those just "lucky" allocations that reduced the top-of-heap?

Timeline:
0s: start renderd - it loads the mapnik style file within ~10-15 seconds
30s: issue render request for all tiles at zoom level 6
45s-100s: database is churning like crazy, renderd threads are mostly idle
100-220s: renderd threads are using up CPU and memory
220s-250s: render_list exits, all rendering is done
300s: benchmark stops

image

@rolandbosa IIRC, some of mod_tile's dependencies are likely the cause of the high memory consumption displayed in this chart. I will have to more closely look at it again, but thank you very much for compiling this!

@hummeltech
Copy link
Collaborator Author

hummeltech commented Jun 27, 2024

@rolandbosa, thanks for the work, I made a few changes to your branch here (as well as adding mimalloc into the mix), I've tried out the script and am seeing this:
image

@hummeltech
Copy link
Collaborator Author

hummeltech commented Jun 29, 2024

Alright @rolandbosa, I have merged in your MALLOC_LIB (#454) and strtok_r (#455) commits into mod_tile and rebased the above-mentioned branch.

Thanks for the help!

@rolandbosa
Copy link
Contributor

@hummeltech Oh, that merge was a surprise... I consider my branch still work in progress, and it had quite some junk in it. But now I see you just took the "good" parts and merged those. Thank you! I think the jemalloc choice is a step in the right direction!

I don't know if I've mentioned this already, but I managed ("hacked" - definitely don't merge that ;-) ) to get renderd to quit and dump a leak profile. But I'm having problems analyzing the profile in depth and won't have time for a while to work on that. But it looks though like there are no apparent leaks during the renderd::mapnik calls, it's just a lot of caching that sticks around.

So I started thinking of implementing some configurable recycling of the render threads, which might help to cut down on the overall memory usage; of course, at the cost of needing to reload the map style and fonts etc. Maybe with some tweak-able parameters to spread the cost of spinning up a render thread over N tiles, or M seconds, or both. I hope to get back to this when time allows...

@hummeltech
Copy link
Collaborator Author

Haha, yes that's right, just the "best" parts of your use_jemalloc branch, thanks again!

I did try out your exit please command and I do think it is a good idea, but was getting a segmentation fault here. Also, I think it might go better under a renderd option if and when you get back to it. Thanks for your contributions!

@rolandbosa
Copy link
Contributor

Hm, I can see this possibility for the fault: I tried to find a way to distinguish if a statistics thread was created or not by initializing the value of stats_thread with a "null value". Apparently this doesn't exist portably, instead one should use the value of pthread_self() and do a pthreads_equal() comparison. However, according to the man page (and also as qbert220 points out), the value is undefined upon return, so a proper implementation should also re-initialize the variable upon creation failure (in addition to the initialization before checking if a statistics threads is requested at all). Why the creation would fail is a different question though...

In either case, the statistics thread is also sitting in a hard sleep() and that should be rewritten to use a timed condition sleep: sleep for the requested time OR when a exit has been requested. But since this was just testing code to figure out any post-mortem leaks, I chose the "Can't be bothered with this detail now"-way.

These days, renderd does not terminate cleanly (the signal handlers didn't work for me, they seem to be a mechanism for the render threads to indicate a catastrophy (e.g. style/font loading failed)). Wouldn't the proper way be to respond to a signal? I'd imagine that's what e.g. systemd would send to the process e.g. during a shutdown/reboot and nowadays it just gets a kill signal after some time of not responding to it...? How important would that be as a feature?

@hummeltech
Copy link
Collaborator Author

Hm, I can see this possibility for the fault: I tried to find a way to distinguish if a statistics thread was created or not by initializing the value of stats_thread with a "null value". Apparently this doesn't exist portably, instead one should use the value of pthread_self() and do a pthreads_equal() comparison. However, according to the man page (and also as qbert220 points out), the value is undefined upon return, so a proper implementation should also re-initialize the variable upon creation failure (in addition to the initialization before checking if a statistics threads is requested at all). Why the creation would fail is a different question though...

Thanks for the explanation, that sounds like it might be a good addition. The stats thread is optional, so that is at least one reason it would not be created, any other errors should lead to an exit.

In either case, the statistics thread is also sitting in a hard sleep() and that should be rewritten to use a timed condition sleep: sleep for the requested time OR when a exit has been requested. But since this was just testing code to figure out any post-mortem leaks, I chose the "Can't be bothered with this detail now"-way.

Yes, there are a number of potential improvements that can be made,

These days, renderd does not terminate cleanly (the signal handlers didn't work for me, they seem to be a mechanism for the render threads to indicate a catastrophy (e.g. style/font loading failed)). Wouldn't the proper way be to respond to a signal? I'd imagine that's what e.g. systemd would send to the process e.g. during a shutdown/reboot and nowadays it just gets a kill signal after some time of not responding to it...? How important would that be as a feature?

You are correct that this did not exist for a very long time (and still does not in the latest release v0.7.2), it was, however, added recently into the default branch here.

@rolandbosa
Copy link
Contributor

You are correct that this did not exist for a very long time (and still does not in the latest release v0.7.2), it was, however, added recently into the default branch here.

😳 ... and it's also present in my code base! I don't know what I was looking at when I wrote the comment above; consider it retracted. Thanks for pointing it out.

@rolandbosa
Copy link
Contributor

@hummeltech I rewrote the shutdown code a little and hooked it into the signal handler. I have only done minimal testing on this, but feel free to take a look at "the good parts".

I do get a message from jemalloc about leaks now, but I don't recognize any of the symbols:

...
** INFO: 13:19:04.695: Render thread 7/8 exited.
** INFO: 13:19:04.695: Destroying request queue...
** INFO: 13:19:04.695: Request queue destroyed.
<jemalloc>: Leak approximation summary: ~2488287 bytes, ~5 objects, >= 3 contexts
<jemalloc>: Run jeprof on dump output for leak detail

❯ jeprof --text src/renderd jeprof.11418.0.f.heap 
Using local file src/renderd.
Using local file jeprof.11418.0.f.heap.
Total: 2.6 MB
     1.6  61.4%  61.4%      2.1  80.7% std::vector::_M_realloc_insert
     0.5  19.3%  80.7%      0.5  19.3% get_self_exe_path
     0.5  19.3% 100.0%      0.5  19.3% __trans_list_add
     0.0   0.0% 100.0%      0.5  19.3% Option::Option
     0.0   0.0% 100.0%      2.1  80.7% SubProcessTimed::~SubProcessTimed
     0.0   0.0% 100.0%      2.6 100.0% _dl_init
     0.0   0.0% 100.0%      2.6 100.0% _dl_start_user
     0.0   0.0% 100.0%      2.1  80.7% build_options
     0.0   0.0% 100.0%      2.6 100.0% call_init
     0.0   0.0% 100.0%      0.5  19.3% flnl_result_alloc
     0.0   0.0% 100.0%      2.1  80.7% fmt::v7::detail::error_handler::on_error

The above output is from spinning up renderd and immediately asking it to exit (no tile rendering). The output below has force-rendered all tiles at zoom level 4 and 6 and shows the same leaks: I think the rendering code itself seems to be leak-free if allowed to terminate.

** INFO: 14:28:32.856: Render thread 7/8 exited.
** INFO: 14:28:32.856: Destroying request queue...
** INFO: 14:28:32.856: Request queue destroyed.
<jemalloc>: Leak approximation summary: ~2488287 bytes, ~5 objects, >= 3 contexts
<jemalloc>: Run jeprof on dump output for leak detail

❯ jeprof --text src/renderd jeprof.17275.0.f.heap 
Using local file src/renderd.
Using local file jeprof.17275.0.f.heap.
Total: 2.6 MB
     1.6  61.4%  61.4%      2.1  80.7% std::vector::_M_realloc_insert
     0.5  19.3%  80.7%      0.5  19.3% get_self_exe_path
     0.5  19.3% 100.0%      0.5  19.3% __trans_list_add
     0.0   0.0% 100.0%      0.5  19.3% Option::Option
     0.0   0.0% 100.0%      2.1  80.7% SubProcessTimed::~SubProcessTimed
     0.0   0.0% 100.0%      2.6 100.0% _dl_init
     0.0   0.0% 100.0%      2.6 100.0% _dl_start_user
     0.0   0.0% 100.0%      2.1  80.7% build_options
     0.0   0.0% 100.0%      2.6 100.0% call_init
     0.0   0.0% 100.0%      0.5  19.3% flnl_result_alloc
     0.0   0.0% 100.0%      2.1  80.7% fmt::v7::detail::error_handler::on_error

@frafra
Copy link

frafra commented Nov 28, 2024

Debian bookworm: renderd used all the available memory (64 GB + 32 GB swap). I switched to jemalloc and memory usage is usually below 5 GB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants