Skip to content

Latest commit

 

History

History
80 lines (49 loc) · 6.07 KB

supervision.md

File metadata and controls

80 lines (49 loc) · 6.07 KB

Supervision

Introduction

Supervision is a special system attached to each container which allows defining strategy for handling incoming faults and problems. This article describes supervision system implementation in Kraken Framework and shows how it can be used.

Supervision & Monitoring

Supervision

Supervision is a dependency relationship description between containers. The supervisor, which is entry point of supervision system, oversees subordinates and therefore must respond to their failures. When a subordinate detects a failure, i.e. throws an exception, it searches itself for proper strategy and then applies it. All the problems are solved using set of solvers. Depending on the nature of the work to be supervised and the nature of the failure, the supervision system has a choice of the following two options:

- Solve the problem using local supervisor - Escalte the problem delegating it to remote supervisor

It is important to always view container as a part of a supervision hierarchy. Each container is a subordinate to its parent supervision system. The only exception from this rule is root container, which posses no remote supervision system. To ensure root container works properly all the time you can assign a system supervisor to it. You are able to read more about this in project deployment.

Local Supervision

Local supervision is the first layer of supervision system. Each container has its own local supervision mechanisms that should be used to solve easy problems which does not require to keep reaction from its parent.

Remote Supervision

Remote supervision is the second layer of supervision system. It is placed in the container's parent's domain. The request to the remote supervisor is done automatically if unhandled exception is not valid to solve for a local system.

Translating The Failures

Each supervisor is configured with a function translating all possible failure causes into solvers, called strategy. Solvers do not take the failed actor’s identity as an input, which might not seem flexible, but at this point it is vital to understand that supervision is about forming a recursive fault handling structure. If you try to do too much at one level, it will become hard to reason about.

Applying The Strategy

The supervision strategy is a set of possible failure identifies accompanied with an array of solvers which are designed to solve it. Using strategy is similar to try-catch block, in the sense, that the first valid record is executed, and the search function then terminates. Therefore, it is extremely important to understand that the order of solvers matters.

To apply the strategy one should modify supervision.base and supervision.remote configuration options of any container or directly modify existing supervisor via Supervisor.Base and Supervisor.Remote services. The details about supervisors are presented in supervision API article.

Triggering The Supervisor

Triggering the supervisor is done automatically by all unhandled errors and exceptions. It might also be done manually by calling fail method of Kraken\Runtime\RuntimeContainerInterface which references current container. When trigger happens, your application changes state from started to failed, switches its workflow and ceases all non-solving related operations. In short, it switches your container to maintenance mode. To mark the problem as solved, one of the solvers have to call succeed method, which will switch back failed state to the started one and resume unfinished callbacks. It is very important to know, that solvers do not call this method automatically, meaning you have to remember about this yourself. It is a good practice to keep ContainerContinue solver at the end of each solving chain.

{warning} Calling succeed method will automatically set your container flow back to the default state halting the rest of solvers, as the problem has been marked as solved. That's why it is important to keep succeeding as the last solving directive.

Delegating Solving

Delegating solving process to remote supervisor can be done from local supervisor, using one of the solvers, after it had entered failed state. To do this, your solver should call its parent's cmd:error command passing proper exception and message param or easier via using CmdEscalate solver.

Basic Solvers

Solver is a piece of code that is executed by supervisor as a part of strategy to solve occurred problem. Solvers might be compared to special case of commands. Kraken provides your application a set of solver to choose from. All of the possible default solvers are defined inside Kraken\Runtime\Supervision namespace.

There are three families of default solvers:

The cmd Solvers

The cmd solvers' purpose is to provide your application with the set of helpers to work with errors. They allow logging, escalating and other non-container related operations.

The container Solvers

The container solvers are set of solvers which allow to work on container that has thrown exception itself. They should be used in local supervision and allows basic container-related operations.

The runtime Solvers

The runtime solver are set of solvers which are designed to work on remote containers. They should be used in remote supervision and allows full-set of container-related operations, including restarting a child.