Everything works fine until the IP of the Infinispan pod changes (for example, when the Infinispan pod is deleted and then automatically recreated).
When this happens, the Command Router service begins to log 'Closing connection ...' as WARN messages and 'Exception encountered ...' as ERROR messages.
e.g. Closing connection [id: 0x10332e0e, L:/10.1.108.52:32922 ! R:10.1.108.50/10.1.108.50:11222] due to transport error ... (see the attached log file hono-command-router-infinispan.txt)
The issue is that the Command Router continues to log 'Closing connection' WARN messages and 'Exception encountered ...' ERROR messages with the old IP of Infinispan (10.1.108.50), even though Infinispan is functioning correctly with the new IP (10.1.108.53). During this time, the Command Router's readiness probe is failing. The problem sometimes resolves itself, for example, after 5 minutes (refer to log file hono-command-router-infinispan.txt and find messages with Infinispan's new IP 10.1.108.53), but at other times, the issue persists even after 30 minutes.
It appears that the Command Router caches the IP of Infinispan and does not attempt to re-resolve its hostname (in our case, dmp-infinispan) for an extended period of time. Do you have any idea how to fix this problem?
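To illustrate what we mean by caching (this is only a guess on our side, not something we have verified, and the values keys below are assumptions rather than lines from our chart): the JVM itself caches resolved addresses, so one experiment would be to pin its positive DNS cache TTL to a low value for the Command Router container:

```yaml
# Hypothetical experiment, not part of our current setup:
# force the JVM to re-resolve host names after at most 10 seconds.
commandRouterService:
  extraEnv:                      # assumed key; the chart may expose env vars differently
    - name: JAVA_OPTIONS         # assumes the container image passes this to the JVM
      value: "-Dsun.net.inetaddr.ttl=10"
```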
Am I right in assuming that you are using a single-node Infinispan cluster? If so, then in order to make the setup resilient to crashes, you should switch to a multi-node Infinispan cluster. This will allow the Command Router to fail over to another Infinispan pod once the one it is currently interacting with is no longer available.
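For illustration only (the key names depend on which Infinispan chart and version you use and are not taken from your setup), a two-node data grid could be requested roughly like this:

```yaml
# Sketch for the official Infinispan Helm chart; other charts use different keys.
deploy:
  replicas: 2   # two Infinispan pods, so the grid survives a single pod being restarted
```

The Hotrod client learns the current cluster topology from whichever node it reaches first, which is what enables the failover described above.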
Hi @sophokles73,
I work on the same project as @petr-cada and have taken the issue over from him. You're right that we're using a single-node Infinispan installation. I've tried to set up a cluster of two nodes, but it didn't bring any improvement. When I restarted the Infinispan node to which the Command Router was connected, the Command Router kept trying to connect to the old node (the previous pod's IP address). When I restarted the Command Router, it connected to one of the two Infinispan nodes at random. With every further Infinispan pod restart, the same behavior repeated.
We are installing Eclipse Hono into our Kubernetes cluster as a dependency of our own Helm chart.
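The Chart.yaml stanza looks roughly like this (the version is a placeholder, not the exact one we use):

```yaml
dependencies:
  - name: hono
    repository: https://eclipse.org/packages/charts   # Eclipse IoT Packages chart repository
    version: "x.y.z"                                   # placeholder
```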
We are also installing Infinispan into our Kubernetes cluster as a Helm chart dependency.
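Again roughly along these lines (chart name, repository, and alias are illustrative assumptions rather than copies of our file):

```yaml
dependencies:
  - name: infinispan
    repository: https://charts.openshift.io/   # example repository hosting an Infinispan chart
    version: "x.y.z"                            # placeholder
    alias: dmp-infinispan                       # assumption: where the dmp-infinispan name comes from
```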
In our values file we have the following configuration for the Command Router, so that it uses the Infinispan data grid.
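In outline (the key names reflect our understanding of the Hono chart's data grid settings and may not match our file exactly; credentials are redacted):

```yaml
dataGridExample:
  enabled: false                         # we do not deploy the chart's example data grid
dataGridSpec:
  serverList: "dmp-infinispan:11222"     # Kubernetes Service of our Infinispan installation
  authServerName: "infinispan"
  authRealm: "default"
  authUsername: "<redacted>"
  authPassword: "<redacted>"
```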
With this setup we see exactly the behavior described in the original issue: everything works fine until the IP of the Infinispan pod changes, and from then on the Command Router keeps logging 'Closing connection ...' WARN messages and 'Exception encountered ...' ERROR messages against the old Infinispan IP (10.1.108.50), even though Infinispan is running correctly under its new IP (10.1.108.53). During this time the Command Router's readiness probe fails. Sometimes the problem resolves itself after about 5 minutes (see hono-command-router-infinispan.txt for messages with the new IP 10.1.108.53), but at other times it persists for more than 30 minutes. It again looks as if the Command Router caches the resolved IP of Infinispan and does not try to re-resolve its hostname (dmp-infinispan) for an extended period of time. Do you have any idea how to fix this problem?