It checks the health status of every instance of every service in the cluster, as well as the health of dependencies such as MySQL, Redis, OSS, and etcd, and it is easy to extend to other middleware.
Background
- While services and components are running, failures that make them unavailable are inevitable. Beyond ensuring high availability, failures must be detected, reported, and recovered from promptly so that they do not lead to a global outage.
- To get health checks and alarms for a service, you only need to add the active health check client dependency to that service.
- For components such as databases, a simple configuration gives second-level detection of middleware failures instead of waiting for customer feedback, which speeds up fault handling and improves customer satisfaction with the product.
The active health check and alarm component is divided into three parts, namely client, server, and alarm service.
This repository contains all the code for the client and the server, but not the alarm service. Because alarm requirements differ from business to business, you can develop your own alarm service: store the check results, then define the alarm rules and channels that fit your needs.
Passive health check: Spring Boot
Active health check: MySQL, Redis, etcd, OSS
To be supplemented.
Service Information Storage Structure:
/**
* namespace, serviceName : set<instance>
*/
public final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();
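The map above is keyed by namespace, then by service name. The following is a minimal sketch of how an instance might be registered and looked up in such a structure. The `Service` class shown here is a hypothetical stand-in that only holds a set of instance addresses, and `register` and `instancesOf` are illustrative names, not the repository's actual API.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the repository's Service class: it only tracks instance addresses.
class Service {
    final Set<String> instances = ConcurrentHashMap.newKeySet();
}

class ServiceRegistrySketch {
    // namespace -> serviceName -> Service (which holds the set of instances)
    final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();

    // Registers an instance, creating the namespace and service entries on first use.
    void register(String namespace, String serviceName, String address) {
        serviceMap
            .computeIfAbsent(namespace, ns -> new ConcurrentHashMap<>())
            .computeIfAbsent(serviceName, name -> new Service())
            .instances.add(address);
    }

    // Returns the known instances, or an empty set if the namespace or service is unknown.
    Set<String> instancesOf(String namespace, String serviceName) {
        Map<String, Service> services = serviceMap.get(namespace);
        if (services == null) {
            return Set.of();
        }
        Service service = services.get(serviceName);
        if (service == null) {
            return Set.of();
        }
        return service.instances;
    }
}
```

Using `computeIfAbsent` keeps the create-if-missing step atomic per key, so two concurrent registrations for the same new service do not overwrite each other.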
It can check the health status of all instances of all services in the cluster, as well as the health status of dependencies such as MySQL, Redis, OSS, and etcd.
This component is easy to integrate into the Spring Boot project with minimal code intrusion.
<dependency>
<groupId>rpa</groupId>
<artifactId>health-client</artifactId>
<version>0.0.1</version>
</dependency>
Configuration example:
# HealthChecker Server deployment address
health-server:
  info:
    ip: 127.0.0.1
    port: 9001
    namespace: dev
a. Enable the health check for the corresponding middleware in the health server by setting enableCheck to true, then configure the connection information for that middleware.
server:
  port: 9001
spring:
  application:
    name: health-server
health:
  mysql:
    enableCheck: false # Whether the MySQL health check is enabled: true = enabled, false = disabled
    ip: 172.30.92.11 # MySQL access address
    port: 3306 # MySQL access port
    database: rpa # MySQL database name
    user: root
    pwd: '*&T3*1(%imk@VB'
  redis:
    enableCheck: false # Whether the Redis health check is enabled: true = enabled, false = disabled
    ip: 172.30.92.11
    port: 6379
    pwd: "&r6Fe$^7%NBm"
  s3:
    enableCheck: false # Whether the S3 health check is enabled: true = enabled, false = disabled
    ip: "172.30.92.11:9000"
    bucketName: rpa
    prefix: resource/
    secretKey: "dyAwnn1eGIlxEepisVQAjHm6qzxXbF3x"
    accessKey: "grMSBA3SXispbLaB"
  etcd:
    enableCheck: false # Whether the etcd health check is enabled: true = enabled, false = disabled
    ip: 172.30.92.11
    port: 22379
b. If other middleware needs health checks, this component can be extended. The extension steps are as follows:
For the configuration (Redis is used as the example): add the necessary annotations, set the type value (such as Redis), and add the connection information fields, as shown in the following figure:
Add the @HealthChecker annotation and set its type value. Implement the HealthCheckProcessor interface and start a thread that performs the health checks, as shown in the following figure:
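A minimal sketch of these two steps, written under assumptions: the text only states that `@HealthChecker` carries a type value and that `HealthCheckProcessor` is an interface, so the annotation definition, the `check()` method, the `TcpCheckConfig` class, and the `custom-tcp` type below are all hypothetical. The sketch treats a successful TCP connect as healthy and leaves out starting the checking thread.

```java
import java.io.IOException;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.net.InetSocketAddress;
import java.net.Socket;

// Assumed shape of the annotation: the source only says it has a "type" value.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface HealthChecker {
    String type();
}

// Assumed shape of the processor interface; the real method names may differ.
interface HealthCheckProcessor {
    boolean check();
}

// Hypothetical connection information for the new middleware type.
class TcpCheckConfig {
    final String ip;
    final int port;

    TcpCheckConfig(String ip, int port) {
        this.ip = ip;
        this.port = port;
    }
}

@HealthChecker(type = "custom-tcp")
class TcpHealthCheckProcessor implements HealthCheckProcessor {
    private final TcpCheckConfig config;

    TcpHealthCheckProcessor(TcpCheckConfig config) {
        this.config = config;
    }

    // Reports healthy if a TCP connection can be opened within one second.
    @Override
    public boolean check() {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(config.ip, config.port), 1000);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A real processor for a specific middleware would typically run a protocol-level probe (for example a ping command or a trivial query) rather than a bare TCP connect.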
Asynchronous
Blocking queue
Service registration tasks are placed in a blocking queue, and a thread pool applies the instance updates asynchronously, which improves concurrent write capacity.
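The queue-plus-workers arrangement can be sketched as follows. This is an illustration under assumptions, not the repository's code: the class and method names are hypothetical, and an instance is represented by a plain address string.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncRegistrationSketch {
    // Registration requests wait here until a worker picks them up.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Set<String> instances = ConcurrentHashMap.newKeySet();
    private final ExecutorService workers;

    AsyncRegistrationSketch(int workerCount) {
        workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            workers.execute(this::drainLoop);
        }
    }

    // Called by request handlers: enqueue the registration and return immediately.
    void submit(String address) {
        pending.add(address);
    }

    // Each worker blocks on the queue and applies registrations one at a time.
    private void drainLoop() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                instances.add(pending.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and exit the loop
        }
    }

    Set<String> instances() {
        return instances;
    }

    // Interrupts the workers so they leave drainLoop.
    void shutdown() {
        workers.shutdownNow();
    }
}
```

The caller of `submit` never waits for the instance update itself, so a burst of registrations is absorbed by the queue instead of blocking request threads.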
Thread pool
The threads that run the scheduled checks are reused.
Connection pool
The connections used for database checks are reused.
Double-Checked Lock
Ensure that there is only one instance in the application
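For reference, a minimal sketch of double-checked locking in Java. The class name `HealthClientHolder` is hypothetical; the text does not say which class the component guards this way.

```java
class HealthClientHolder {
    // volatile is required so that a fully constructed instance is published safely.
    private static volatile HealthClientHolder instance;

    private HealthClientHolder() {
    }

    static HealthClientHolder getInstance() {
        HealthClientHolder local = instance;       // first check, without locking
        if (local == null) {
            synchronized (HealthClientHolder.class) {
                local = instance;                  // second check, under the lock
                if (local == null) {
                    local = new HealthClientHolder();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

After the instance has been created, callers take the lock-free path, which is where the reduced synchronization overhead comes from.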
Observer pattern
Dependencies are kept to a minimum, so the code has low coupling and is easier to maintain.
Registration handling, heartbeat handling, HTTP request sending, and other functions are each encapsulated in their own class.
Strategy pattern
The HealthCheckProcessor instance that matches the task type is selected, and the specific processing logic is delegated to that processor.
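The selection step can be sketched as a lookup from task type to processor. The single-method `HealthCheckProcessor` interface and the dispatcher class below are assumptions made for illustration; only the interface name comes from the text.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Assumed single-method shape of the processor interface.
interface HealthCheckProcessor {
    boolean check();
}

class ProcessorDispatcherSketch {
    // task type (for example "mysql" or "redis") -> processor that handles it
    private final Map<String, HealthCheckProcessor> processorsByType = new ConcurrentHashMap<>();

    void register(String type, HealthCheckProcessor processor) {
        processorsByType.put(type, processor);
    }

    // Picks the processor for the given type and delegates the check to it.
    boolean dispatch(String type) {
        HealthCheckProcessor processor = processorsByType.get(type);
        if (processor == null) {
            throw new IllegalArgumentException("No processor registered for type: " + type);
        }
        return processor.check();
    }
}
```

Adding support for a new middleware then means registering one more processor; the dispatching code itself does not change.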
synchronized
Modifications to the service list are locked to avoid thread-safety problems caused by concurrent updates.
Double-Checked Lock
Ensure thread safety and reduce synchronization overhead
CopyOnWrite
In the addIPAddress method, the old instance list is copied and the new instance is added to the copy. After the instance status is updated, the new list replaces the old one. The old list is not modified during the update, so it can still be read while heartbeats are processed and health status is determined.
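A minimal sketch of this copy-then-swap approach, assuming instances are identified by an address string. Only the method name `addIPAddress` comes from the text; the class, the field, and the `snapshot` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InstanceListSketch {
    // Readers always see a complete, unmodifiable snapshot through this reference.
    private volatile List<String> instances = Collections.emptyList();

    // Writers are serialized; each write builds a new list and then swaps the reference.
    synchronized void addIPAddress(String address) {
        List<String> copy = new ArrayList<>(instances);
        if (!copy.contains(address)) {
            copy.add(address);
        }
        instances = Collections.unmodifiableList(copy);
    }

    // Lock-free read: heartbeat and health evaluation code can iterate over this safely.
    List<String> snapshot() {
        return instances;
    }
}
```

A reader that obtained a snapshot before a write keeps iterating over the old list, which is never mutated. The JDK's `CopyOnWriteArrayList` provides the same behavior ready-made.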
The health server is highly available: it supports multi-instance deployment with data synchronization between nodes.