We know there are projects (mostly very old) for which renku migrate or renku graph export causes the container's memory usage to grow without bound. As a consequence, k8s kills the container, we pick up the same event again, and the vicious circle starts over.
We could actively check the memory of the system process we initiate (see the sketch after the list below), but before making such a change we need to decide:
- should we do it, keeping in mind we'd cede this process to renku-core (or something else) at some point?
- do we know when this work will be taken over by some other process?
- how are we going to determine the limit, knowing that each deployment could have a different memory request?
- is it serious enough (do we have enough projects with the problem) to be worth doing? If not, we'd need to keep in mind that a manual intervention would be required each time such a project is processed (which is rather impossible to do).
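To make the idea concrete, here is a minimal sketch of guarding a spawned CLI run, in Python purely for illustration (not a claim about the service's actual implementation). It assumes a Linux container where the effective memory limit can be read from the cgroup filesystem (which would also answer the per-deployment limit question, since the pod's memory limit is exposed there) and kills the child once its RSS crosses a configurable fraction of that limit. All function names, the threshold fraction, and the polling interval are made up for this sketch.

```python
import subprocess
import time


def cgroup_memory_limit_bytes() -> int | None:
    """Best-effort read of the container memory limit (cgroup v2, then v1)."""
    for path in ("/sys/fs/cgroup/memory.max",
                 "/sys/fs/cgroup/memory/memory.limit_in_bytes"):
        try:
            raw = open(path).read().strip()
        except OSError:
            continue
        if raw.isdigit():  # cgroup v2 reports "max" when unlimited
            return int(raw)
    return None


def rss_bytes(pid: int) -> int:
    """Resident set size of a process, read from /proc/<pid>/status (0 if gone)."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) * 1024  # reported in kB
    except OSError:
        pass
    return 0


def run_with_memory_guard(cmd, limit_fraction=0.8, poll_seconds=1.0):
    """Run cmd, killing it if its RSS exceeds a fraction of the container limit.

    Returns (exit_code, killed_for_memory).
    """
    limit = cgroup_memory_limit_bytes()
    threshold = int(limit * limit_fraction) if limit else None
    child = subprocess.Popen(cmd)
    while child.poll() is None:
        if threshold is not None and rss_bytes(child.pid) > threshold:
            child.kill()
            child.wait()
            return child.returncode, True
        time.sleep(poll_seconds)
    return child.returncode, False


# Hypothetical usage: guard the export that blows up on some old projects.
# exit_code, killed = run_with_memory_guard(["renku", "graph", "export"])
```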
I guess if we decide to do the fix, we'd need to classify such an error as a GENERATION_NON_RECOVERABLE_FAILURE so that the event is not picked up again (see the sketch below).
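A correspondingly hedged sketch of how the guarded run's outcome could be mapped onto an event status; only GENERATION_NON_RECOVERABLE_FAILURE comes from this issue, the other names are placeholders:

```python
def classify_outcome(exit_code: int, killed_for_memory: bool) -> str:
    """Hypothetical mapping from a guarded run to an event status."""
    if killed_for_memory:
        # Re-picking the event would just repeat the memory blow-up,
        # so mark it as non-recoverable instead of letting it loop.
        return "GENERATION_NON_RECOVERABLE_FAILURE"
    # Placeholder names for the non-memory outcomes.
    return "SUCCESS" if exit_code == 0 else "RECOVERABLE_FAILURE"
```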
This is getting more urgent now. It happened again, and with only logs available and no direct access to the production system, it is very hard, if not impossible, to figure out what to do in the short term.