Solr process eventually runs out of memory #538
This happened again. I'd like to escalate the priority of this issue.
Potentially related: datamade/django-councilmatic#205
This article suggests that frequent updates require a bigger heap size. The staging Solr index is updated once per day. The production Solr index is updated every 15 minutes, or 96 times per day, fully a quarter of which reindex every bill in the database. That could be one reason we're seeing this on production, but not staging. I monitored the production Solr instance while a full reindex was taking place. Heap use hovered between 40 and 60% of the allocated memory (half a gig). This doesn't seem like enough to run out of memory, so I wonder if there's a leak somewhere that gradually increases heap use. In that case, increasing heap size may only be a band-aid. I've increased heap size on a branch, but I'd actually like to hold off on merging and check on this once a week for a few weeks to get a handle on whether heap use is creeping up, or whether our errors come from more of a shock to the system.
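As a sketch of what that weekly check could look like in code: Solr (6.4+) exposes JVM memory figures through its metrics API (`GET /solr/admin/metrics?group=jvm`). The snippet below parses a response of that shape and computes heap use as a percentage of the cap; the payload values here are illustrative, not real measurements from our instance.

```python
import json

# Illustrative sample of the JSON shape returned by Solr's JVM metrics
# endpoint; the byte counts are made up for the example.
sample = json.loads("""
{
  "metrics": {
    "solr.jvm": {
      "memory.heap.used": 280000000,
      "memory.heap.max": 536870912
    }
  }
}
""")

def heap_pct(metrics: dict) -> float:
    """Return heap usage as a percentage of the configured maximum."""
    jvm = metrics["metrics"]["solr.jvm"]
    return 100.0 * jvm["memory.heap.used"] / jvm["memory.heap.max"]

print(f"heap in use: {heap_pct(sample):.1f}%")
```

Logging this figure once a week would show whether heap use is creeping upward (a leak or cache growth) or staying flat between sudden failures.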
This does sound like a memory leak. The first thing I would try in this case is to upgrade Solr.
I think your monitoring plan is also good.
This happened again after three weeks. |
Yikes, this happened again on the new server. I'd like to escalate this issue in the next month or two. |
Woah! This blog post is very, very helpful in tuning memory needs for Solr. In particular, it offers an explanation for how Solr uses memory. Most notably:
So, a compelling reason why production index updates eventually fail is because Solr's various caches grow large enough that there is no longer sufficient heap space to make updates. This would also explain why restarting Solr frees up space. Since the staging site is nowhere near as regularly used as the production site, it would also explain why we don't see this on staging. I think we will need to do a combination of limiting the max size of the caches, and perhaps giving the production Solr index a bit more memory to work with, to solve this issue. Will continue reading and update this thread.
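For reference, capping the caches happens in the `<query>` section of `solrconfig.xml`. The fragment below is a sketch, not a tuned recommendation: the sizes are assumptions to illustrate the knobs, and the cache class varies by release (older Solr versions use `solr.LRUCache` here; newer ones ship `solr.CaffeineCache`).

```xml
<!-- solrconfig.xml: illustrative cache caps; sizes are assumptions, not tuned values -->
<query>
  <filterCache      class="solr.LRUCache" size="256" initialSize="64" autowarmCount="32"/>
  <queryResultCache class="solr.LRUCache" size="256" initialSize="64" autowarmCount="32"/>
  <documentCache    class="solr.LRUCache" size="256" initialSize="64" autowarmCount="0"/>
</query>
```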
You can view stats on the various caches in the Solr admin by selecting your core in the lefthand menu, then navigating to Plugins / Stats > Cache. The big one for us is the document cache, which is at its max size of 512. With 2514 docs totaling 652.8 MB, we can estimate that each of our documents weighs about 0.25 MB. That means our document cache is around 128 MB in size, or a quarter of our available heap space. There are some items in the query and filter caches, as well, but neither is close to full. According to this article, those are the ones that can potentially get quite big. I could spend a lot of time further spelunking here, but I think we'll see diminishing returns to the precision / time spent. I'm going to bump the production Solr heap size up to 1 GB (double its current heap size) and continue monitoring this thread.
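Spelled out, the back-of-the-envelope arithmetic above is the following (the unrounded numbers land a few MB above the ~128 MB figure, same ballpark):

```python
# Back-of-the-envelope estimate of the document cache's memory footprint,
# using the numbers reported in the Solr admin stats.
index_size_mb = 652.8    # total size of all docs in the core
num_docs = 2514          # documents in the core
cache_max_entries = 512  # documentCache max size from Plugins / Stats

mb_per_doc = index_size_mb / num_docs
cache_mb = mb_per_doc * cache_max_entries
print(f"~{mb_per_doc:.2f} MB/doc, cache up to ~{cache_mb:.0f} MB")
```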
Excellent research so far! Three questions:
Thank you for these excellent prompts, @jeancochrane! I've increased Solr's memory in production, and I'll keep an eye on this issue. If we wind up needing further work, I'll start with these questions.
Closing since we're on ElasticSearch now 🙂 |
Offshoot of #534, related to #535.
After running without issue for about seven months, the production Solr process ran out of memory to accept new updates. Restarting the process freed up enough memory to resolve the issue; however, this is only a temporary fix.
By default, Solr caps memory use ("heap size") at about half a gig. The docs suggest that this will be insufficient for most production setups. We probably don't need the recommended 10GB, but a middle ground may be more appropriate, especially given the size of our documents.
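For reference, a heap bump like the one contemplated here could look like the following sketch against Solr's standard startup configuration (the 1g figure is illustrative of a "middle ground"; the exact value is what this issue is meant to determine):

```shell
# solr.in.sh (often /etc/default/solr.in.sh on installed setups):
# raise the JVM heap cap from the 512m default.
SOLR_HEAP="1g"

# Equivalent one-off form when starting Solr by hand:
# bin/solr start -m 1g
```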
This thread contains some guidance on getting a handle on the memory consumption of Solr processes. This may help us determine a saner value.
This post also looks like a good resource on why heap use grows and ways of addressing it.