We need to make sure that #48 does not occur in other patterns. From what I can tell, it looks like the Processing Pattern is the only other place where this possibly happens.
To redeploy a cluster member correctly, you need to detach the member from the cluster programmatically by calling CacheFactory.shutdown() when the application is undeployed. CacheFactory.shutdown() then calls the stop() methods of the distributed services that run on the leaving member.
Because the stop() method is run by a service thread, it must not make reentrant service calls; otherwise it can deadlock.
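To illustrate why a reentrant call on a single service thread cannot complete, here is a minimal sketch that uses only java.util.concurrent rather than Coherence itself. The class and method names (ReentrantCallDemo, reentrantCallBlocks) are hypothetical, and a single-threaded executor stands in for a distributed service configured with one thread; a timeout is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class: a single-threaded executor stands in for a
// service that has only one worker thread.
public class ReentrantCallDemo {

    /**
     * Returns true if a task running on the single worker thread fails to get
     * the result of a second task submitted to the same executor in time.
     */
    public static boolean reentrantCallBlocks() throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = service.submit(() -> {
                // Reentrant call: this task is queued behind the task that is
                // about to wait for it.
                Future<String> inner = service.submit(() -> "done");
                try {
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false; // only reachable if another worker thread existed
                } catch (TimeoutException e) {
                    return true;  // the only worker thread is busy running this task
                }
            });
            return outer.get(5, TimeUnit.SECONDS);
        } finally {
            // Discard the still-queued inner task and stop the worker thread.
            service.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("reentrant call blocked: " + reentrantCallBlocks());
    }
}
```

Without the timeout, the outer task would wait indefinitely for work that can only run on the thread it is occupying, which is the shape of the deadlock described in this issue.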
THE PROBLEM
CommandExecutor.stop() calls CacheFactory.ensureCluster(), which is a service call within a service call (that is, a reentrant call):
    public void stop() {
        if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Stopping CommandExecutor for %s", contextIdentifier);

        //stop immediately
        setState(State.Stopped);

        //this CommandExecutor must not be available any further to other threads
        CommandExecutorManager.removeCommandExecutor(this.getContextIdentifier());

        //unregister JMX mbean for the CommandExecutor
        Registry registry = CacheFactory.ensureCluster().getManagement(); // THIS IS THE SERVICE CALL
        if (registry != null) {
            if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Unregistering JMX management extensions for CommandExecutor %s", contextIdentifier);
            registry.unregister(getMBeanName());
        }

        if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Stopped CommandExecutor for %s", contextIdentifier);
    }
If the distributed service used to support the Command Pattern is configured with a single thread (as it is by default), this call will produce a deadlock with a thread dump like this:
DIAGNOSTIC (AND POTENTIAL SOLUTION)
I've changed the code of the CommandExecutor.stop() method to use a non-blocking service call to obtain the Cluster.
…, replacing with CacheFactory.getCluster()
Issue #159: Introduced ability to provide a ConfigurableCacheFactory when creating a ProcessingSession
Issue #160: Ensure consistent use of ClassLoaders based on calling context
Issue #161: Ensure Processing Pattern is initialized using the Cache Configuration LifecycleEvents
Issue #162: Introduce Shared ExecutorService for internal background tasks
Issue #163: Resolves fail-over/fail-back of Grid-based Tasks