Scality S3 exposes a healthcheck route, /live, on the port used for the metrics (port 8002 by default). It returns a response with one of the following HTTP status codes:
- 200 OK: Server is up and running
- 500 Internal Server Error: Server is experiencing an internal error
- 400 Bad Request: Bad request due to unsupported HTTP methods
- 403 Forbidden: Request is not allowed due to IP restriction
A successful response from the healthcheck route (200 OK) carries additional statistics in the response body: the number of requests performed and the number of 500 errors that occurred over the time interval specified in the response.
A sample response looks something like this:
{
  "requests": 5000,
  "500s": 2,
  "sampleDuration": 30
}
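As a rough illustration of how a monitoring client might interpret this response, the sketch below maps the documented status codes to their meaning and derives an error rate from the sample body. The helper name `summarize_live_response` and the shape of its return value are hypothetical, not part of S3, and `sampleDuration` is passed through as reported.

```python
import json

# Status codes documented for the /live route and what they indicate.
LIVE_STATUS = {
    200: "Server is up and running",
    500: "Server is experiencing an internal error",
    400: "Unsupported HTTP method",
    403: "Request not allowed due to IP restriction",
}


def summarize_live_response(status, body):
    """Summarize a /live response (hypothetical helper, not an S3 API)."""
    summary = {
        "healthy": status == 200,
        "reason": LIVE_STATUS.get(status, "Unexpected status code"),
    }
    # Only a successful response carries the statistics body.
    if status == 200 and body:
        stats = json.loads(body)
        requests = stats.get("requests", 0)
        errors = stats.get("500s", 0)
        # Guard against division by zero when no requests were recorded.
        summary["error_rate"] = errors / requests if requests else 0.0
        summary["sample_duration"] = stats.get("sampleDuration")
    return summary


sample_body = '{"requests": 5000, "500s": 2, "sampleDuration": 30}'
print(summarize_live_response(200, sample_body))
print(summarize_live_response(403, ""))
```

In practice the status and body would come from an HTTP GET against /live on the metrics port; they are passed in directly here so the sketch runs without a server.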
The goal is to return stats for the set interval only, i.e., if the interval is 30 seconds, the response covers only the last 30 seconds.
The stats are stored in Redis under simple keys, each incremented with the INCR command on every new push. Each key name ends with a normalized Unix timestamp, so that stats are stored in keys covering 5-second intervals (the default, which is configurable). A default TTL of 30 seconds is associated with each key, so any keys older than the TTL are removed automatically.
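The write path described above could be sketched as follows. This is a minimal illustration under stated assumptions: the key layout `metric:timestamp`, the `FakeRedis` class, and the `record` helper are all hypothetical, and the in-memory stand-in only mimics the INCR and EXPIRE commands so the example runs without a Redis server.

```python
import time

INTERVAL = 5  # seconds per key (documented default)
TTL = 30      # seconds before a key expires (documented default)


class FakeRedis:
    """In-memory stand-in for INCR and EXPIRE (illustration only)."""

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, key):
        # Like INCR, a missing key starts from 0 before incrementing.
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        # Only records the TTL; a real server would delete the key later.
        self.ttls[key] = seconds
        return True


def normalize(timestamp, interval=INTERVAL):
    """Round a Unix timestamp (in seconds) down to the start of its interval."""
    return int(timestamp) - int(timestamp) % interval


def record(client, metric, timestamp=None):
    """Increment the interval key for `metric` and set its TTL."""
    if timestamp is None:
        timestamp = time.time()
    key = f"{metric}:{normalize(timestamp)}"  # hypothetical key layout
    client.incr(key)
    client.expire(key, TTL)
    return key


client = FakeRedis()
record(client, "requests", 1000003)
record(client, "requests", 1000004)  # same 5-second interval, same key
record(client, "requests", 1000005)  # next interval, new key
print(client.values)
```

Because every timestamp within the same 5-second window normalizes to the same value, all pushes in that window land on one counter.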
When a stats query is received, the results for the prior 30 seconds are returned. This is accomplished by retrieving the 6 keys that represent the 6 five-second intervals. As Redis does not have a performant RANGE query, the list of keys is built manually as follows:
- Take the current timestamp
- Build each key by subtracting the interval (5 seconds) from the timestamp
- The total number of keys for each metric (total requests, 500s, etc.) is TTL / interval: 30 / 5 = 6
Note: When Redis is queried, results from non-existent keys are set to 0.
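The key-building steps above can be sketched as follows. This is an illustrative sketch rather than the actual implementation: the `metric:timestamp` key layout and the helper names are assumptions, and a plain dict stands in for Redis so that missing keys can be shown counting as 0.

```python
INTERVAL = 5  # seconds per key (documented default)
TTL = 30      # seconds covered by a query (documented default)


def normalize(timestamp, interval=INTERVAL):
    """Round a Unix timestamp (in seconds) down to the start of its interval."""
    return int(timestamp) - int(timestamp) % interval


def build_keys(metric, now, interval=INTERVAL, ttl=TTL):
    """List the TTL / interval keys covering the window that ends at `now`."""
    start = normalize(now, interval)
    count = ttl // interval  # 30 // 5 = 6 keys with the defaults
    return [f"{metric}:{start - i * interval}" for i in range(count)]


def total(store, metric, now):
    """Sum the counters for `metric`, treating non-existent keys as 0."""
    return sum(int(store.get(key) or 0) for key in build_keys(metric, now))


# A dict standing in for Redis; the last entry falls outside the window.
store = {
    "requests:1000000": 3,
    "requests:999990": 4,
    "requests:999970": 100,
}

print(build_keys("requests", 1000003))
print(total(store, "requests", 1000003))
```

Against a real Redis instance, the list returned by `build_keys` could be fetched in a single round trip (for example with MGET), with missing values mapped to 0 before summing.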
To gather stats, S3 uses a local Redis instance as a temporary datastore. By adding the following config to config.json, stats will be recorded in Redis.
{
  "localCache": {
    "host": "127.0.0.1",
    "port": 6379
  }
}