catch on the exception mongo::UserException [it was "container crashes with exit code 139"] #3326
Comments
So does it happen all the time? Or only when your PC is loaded (I mean, when your PC is not loaded, the problem doesn't happen)? |
It happens quite intermittently... |
How much memory has the CB container allocated? |
I use the default from docker-compose. I'm not sure how much that is, probably 2 GB? |
Here is the result of docker logs and inspect:
Notice that it happens only half a second after startup. |
A problem in the starting order (i.e. CB container is started before MongoDB is started) could make sense. In order to confirm/discard you could do the following test:
More detail on how to run individual containers instead of using docker-compose.yml can be found at https://github.com/telefonicaid/fiware-orion/tree/master/docker#2b-mongodb-runs-on-another-docker-container |
I did as suggested, it works in both cases (mongo before orion, orion before mongo) with appropriate messages. |
In any case, it seems to be an issue related more to Docker or to the available resources on your host system than to Orion itself. What do you think? |
I agree. I'm wondering how to debug it, though. Do you know how I can trace/graph the memory consumption of Docker containers? |
It happens really consistently, on my PC, on Travis and on colleagues' PCs.
Run with:
My system is not loaded at the moment. With this setting, it happens systematically. |
It shouldn't matter but, in order to rule it out, I'd suggest using MongoDB 3.4 and checking whether the problem persists. |
I updated to 3.4, same problem. |
It does seem to be related to memory, after all. After using Skype, my system had little memory available:
Cleaning it made the problem go away:
Now Orion starts correctly. |
It could be. One way to test it would be to run tail -f on the mongo docker log and on the context broker docker log. That way, you can check whether the error occurs only when Orion has finished starting up but MongoDB is still starting. |
I seem to reproduce the problem more often when loading memory:
|
Anyway, a solution could be to add a catch on the exception |
Orion already implements a process to retry connections to MongoDB, giving up after a number of retries if the process fails. However, the idea of capturing the exception makes sense (at least to print the proper error message in the logs :). Let's keep this issue open for that (low priority, I guess)
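To illustrate the idea of retrying a bounded number of times and then failing with a readable error instead of an unhandled exception, here is a minimal sketch in Python. It is not Orion's actual code (Orion is written in C++); the function name `connect_with_retries`, its parameters and the logger name are hypothetical and chosen only for this example.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("startup")  # hypothetical logger name


def connect_with_retries(connect: Callable[[], T], retries: int = 5, delay: float = 1.0) -> T:
    """Call connect() up to `retries` times, then raise one descriptive error."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except Exception as exc:
            # Catch the failure so it is logged instead of crashing the process.
            last_error = exc
            log.warning("database connection attempt %d/%d failed: %s", attempt, retries, exc)
            if attempt < retries:
                time.sleep(delay)
    # All attempts failed: give up with a clear message, keeping the original cause.
    raise RuntimeError(f"could not connect to the database after {retries} attempts") from last_error
```

The caller passes any zero-argument function that opens the connection, so the sketch does not depend on a particular database driver.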
As a workaround, I added a healthcheck to mongo, and Orion now waits until mongo is healthy.
|
Great! Maybe it would be a good idea to include this in the documentation to help other users in the same situation, in particular in docker/README.md. I mean, in a new section 4 Troubleshooting (the old section 4 will now be section 5) with a 4.1 subsection for this specific case. What do you think? Would you like to propose a PR with that piece of documentation? |
But this is still considered a bug, no? Orion should catch the Exception. |
Yes, we can keep the issue open in order to continue the discussion about the Orion exception (and eventually fix it). Regarding this, two things to consider:
|
Apparently |
Is there any alternative or workaround for docker-compose v3? |
An alternative is to use a tool like wait-for-it or dockerize. |
One possibility would be to describe both solutions (for docker-compose v2 and v3). However, if you prefer to keep things simple, I'd suggest to describe the one you have and know it works (based on docker-compose v2) with a final note stating that service_healthy is not going to work in v3 and that wait-for-it should be used, but without going into too much detail. Does it make sense to you? |
Well, I fixed one of my other containers that was consuming a lot of memory on start-up. |
The problem still bites me very often:
In which case I have to do |
Is there any deterministic way to reproduce the problem? It is hard to debug a problem that sometimes happens and sometimes doesn't ;) |
Unfortunately no deterministic way, but it happens very often. |
I still have this issue often with the latest version (2.2.0). If you are at the FIWARE summit in Genova I could show you :) |
Unfortunately, I'm not at the FIWARE summit :( What I would need is a deterministic way of reproducing the problem (i.e. "deploy this given docker-compose.yml, then run request X, then run request Y, etc.") to be able to debug it. |
It's an old thread, but this may help someone stuck at the same point. If you got the docker-compose file from the FIWARE sample code, you will need to define a depends_on directive, like:
This way the broker will only start AFTER mongo! |
Thank you for the hint @anselmobattisti! The thread is somewhat old, but still open, so your feedback is valuable :) @cdupont could you have a look and see if it helps, please? Thanks! |
@anselmobattisti good hint, thanks. |
Is there a way to configure in docker-compose a script to be run from Orion container to check if MongoDB container is ready? Maybe it could be a way to explore. It seems that scripts like that are available out there (for instance, check https://stackoverflow.com/questions/15443106/how-to-check-if-mongodb-is-up-and-ready-to-accept-connections-from-bash-script) |
Yes, it's definitely possible. |
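A minimal sketch of such a readiness check, written in Python and similar in spirit to what wait-for-it does: it polls a TCP port until a connection succeeds or a timeout expires. The function name `wait_for_port` and its parameters are hypothetical, and a successful TCP connection only shows that the port is accepting connections, not that MongoDB has finished initialising.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection is enough for the check.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, host not resolvable yet, or connect timed out.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A container entrypoint could call, for example, `wait_for_port("mongo", 27017)` before launching the broker, assuming the MongoDB service is reachable under the host name `mongo` on MongoDB's default port 27017.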
The MongoDB driver has been completely replaced in PR #3622. If this problem is still happening, it will appear in a completely different form, given that the new driver doesn't have the capability of raising any exception (including mongo::UserException). @cdupont I think it is better to close this issue and, if the problem appears again, open a new issue about it, please. |
Fermin, that's great. I'll open another issue if necessary. |
I recently upgraded Orion to version 1.15.
However it crashes regularly with exit code 139.
For instance:
It seems to be related to a memory problem.
I run the docker version (with a bunch of other docker containers).
This bug happens when my PC is quite loaded (running many tasks together). It happens systematically on Travis CI.