Supervision is a special system attached to each container which allows defining strategy for handling incoming faults and problems. This article describes supervision system implementation in Kraken Framework and shows how it can be used.
Supervision is a dependency relationship description between containers. The supervisor, which is entry point of supervision system, oversees subordinates and therefore must respond to their failures. When a subordinate detects a failure, i.e. throws an exception, it searches itself for proper strategy and then applies it. All the problems are solved using set of solvers. Depending on the nature of the work to be supervised and the nature of the failure, the supervision system has a choice of the following two options:
It is important to always view container as a part of a supervision hierarchy. Each container is a subordinate to its parent supervision system. The only exception from this rule is root container, which posses no remote supervision system. To ensure root container works properly all the time you can assign a system supervisor to it. You are able to read more about this in project deployment.
Local supervision is the first layer of supervision system. Each container has its own local supervision mechanisms that should be used to solve easy problems which does not require to keep reaction from its parent.
Remote supervision is the second layer of supervision system. It is placed in the container's parent's domain. The request to the remote supervisor is done automatically if unhandled exception is not valid to solve for a local system.
Each supervisor is configured with a function translating all possible failure causes into solvers, called strategy. Solvers do not take the failed actor’s identity as an input, which might not seem flexible, but at this point it is vital to understand that supervision is about forming a recursive fault handling structure. If you try to do too much at one level, it will become hard to reason about.
The supervision strategy is a set of possible failure identifies accompanied with an array of solvers which are designed to solve it. Using strategy is similar to try-catch
block, in the sense, that the first valid record is executed, and the search function then terminates. Therefore, it is extremely important to understand that the order of solvers matters.
To apply the strategy one should modify supervision.base
and supervision.remote
configuration options of any container or directly modify existing supervisor via Supervisor.Base
and Supervisor.Remote
services. The details about supervisors are presented in supervision API article.
Triggering the supervisor is done automatically by all unhandled errors and exceptions. It might also be done manually by calling fail
method of Kraken\Runtime\RuntimeContainerInterface
which references current container. When trigger happens, your application changes state from started
to failed
, switches its workflow and ceases all non-solving related operations. In short, it switches your container to maintenance mode. To mark the problem as solved, one of the solvers have to call succeed
method, which will switch back failed
state to the started
one and resume unfinished callbacks. It is very important to know, that solvers do not call this method automatically, meaning you have to remember about this yourself. It is a good practice to keep ContainerContinue
solver at the end of each solving chain.
{warning} Calling
succeed
method will automatically set your container flow back to the default state halting the rest of solvers, as the problem has been marked as solved. That's why it is important to keep succeeding as the last solving directive.
Delegating solving process to remote supervisor can be done from local supervisor, using one of the solvers, after it had entered failed
state. To do this, your solver should call its parent's cmd:error
command passing proper exception
and message
param or easier via using CmdEscalate
solver.
Solver is a piece of code that is executed by supervisor as a part of strategy to solve occurred problem. Solvers might be compared to special case of commands. Kraken provides your application a set of solver to choose from. All of the possible default solvers are defined inside Kraken\Runtime\Supervision
namespace.
There are three families of default solvers:
The cmd
solvers' purpose is to provide your application with the set of helpers to work with errors. They allow logging, escalating and other non-container related operations.
The container
solvers are set of solvers which allow to work on container that has thrown exception itself. They should be used in local supervision and allows basic container-related operations.
The runtime
solver are set of solvers which are designed to work on remote containers. They should be used in remote supervision and allows full-set of container-related operations, including restarting a child.