Starometry is a service designed for the collection and management of metrics for the c12s platform. This service gathers metrics from cAdvisor and node-exporter, enabling real-time monitoring of both machine states and virtualized Docker containers running on the machine.
Starometry supports two types of configurations that can be provided via environment variables. The first type pertains to the application itself, while the second type is related to the metrics.
Parameter | Description | Default value |
---|---|---|
APP_PORT | The port on which the Starometry HTTP server listens. If multiple Starometry agents are running, each is assigned a port number starting from APP_PORT + 1. | 8003 |
GRPC_PORT | The port on which the Starometry gRPC server listens. If multiple Starometry agents are running, each is assigned a port number starting from GRPC_PORT + 1. | 50055 |
NODE_EXPORTER_URL | The address (without the port) that Starometry uses to reach the node-exporter running on the same node. | node_exporter |
NODE_EXPORTER_PORT | The port on which node-exporter is running. | 9100 |
CADVISOR_URL | The address (without the port) that Starometry uses to reach the cAdvisor running on the same node. | cadvisor |
CADVISOR_PORT | The port on which cAdvisor is running. | 8081 |
NATS_URL | The address (without the port) that Starometry uses to reach the NATS server running on the Control Plane. | nats |
NATS_PORT | The port on which NATS is running. | 4222 |
Parameter | Description | Default value |
---|---|---|
APP_METRICS_CONFIG | Comma-separated list of metrics to scrape from cAdvisor or node-exporter. | Link to default list of metrics |
APP_METRICS_CRON_TIMER | Cron job interval that defines how often metrics are scraped. The value must end with 's' for seconds or 'm' for minutes. | 45s |
APP_METRICS_EXTERNAL_CRON_TIMER | Cron job interval that defines how often external metrics are scraped. The value must end with 's' for seconds or 'm' for minutes. | 45s |
A small example of APP_METRICS_CONFIG would be: container_cpu_usage_seconds_total,container_spec_cpu_quota
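The environment variables above could be read along the following lines. This is only a sketch, not Starometry's own loader: the function names (`parse_timer`, `parse_metrics`, `load_config`) are hypothetical, the timer is assumed to be a whole number followed by `s` or `m`, and an unset APP_METRICS_CONFIG is represented as an empty list because the default list is only linked, not reproduced, here.

```python
import os
import re
from datetime import timedelta

# Defaults copied from the two parameter tables above.
DEFAULTS = {
    "APP_PORT": "8003",
    "GRPC_PORT": "50055",
    "NODE_EXPORTER_URL": "node_exporter",
    "NODE_EXPORTER_PORT": "9100",
    "CADVISOR_URL": "cadvisor",
    "CADVISOR_PORT": "8081",
    "NATS_URL": "nats",
    "NATS_PORT": "4222",
    "APP_METRICS_CRON_TIMER": "45s",
    "APP_METRICS_EXTERNAL_CRON_TIMER": "45s",
}

# Assumption: a timer is a whole number followed by 's' or 'm'.
_TIMER = re.compile(r"^(\d+)([sm])$")


def parse_timer(value):
    """Turn a timer such as '45s' or '2m' into a timedelta."""
    match = _TIMER.match(value.strip())
    if match is None:
        raise ValueError(f"timer must end in 's' or 'm': {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(seconds=amount) if unit == "s" else timedelta(minutes=amount)


def parse_metrics(value):
    """Split a comma-separated metric list, dropping blank entries."""
    return [name.strip() for name in value.split(",") if name.strip()]


def load_config(env=os.environ):
    """Read the settings from a mapping, falling back to the documented defaults."""
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "app_port": int(raw["APP_PORT"]),
        "grpc_port": int(raw["GRPC_PORT"]),
        "node_exporter": f'{raw["NODE_EXPORTER_URL"]}:{raw["NODE_EXPORTER_PORT"]}',
        "cadvisor": f'{raw["CADVISOR_URL"]}:{raw["CADVISOR_PORT"]}',
        "nats": f'{raw["NATS_URL"]}:{raw["NATS_PORT"]}',
        "scrape_every": parse_timer(raw["APP_METRICS_CRON_TIMER"]),
        "external_scrape_every": parse_timer(raw["APP_METRICS_EXTERNAL_CRON_TIMER"]),
        # Empty list stands in for the linked default list of metrics.
        "metrics": parse_metrics(env.get("APP_METRICS_CONFIG", "")),
    }
```

Passing a plain dictionary instead of `os.environ` makes the sketch easy to try out, for example `load_config({"APP_METRICS_CRON_TIMER": "2m"})`.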
By default, the Starometry HTTP server is available at http://localhost:8003 and can be accessed with any tool that sends HTTP requests. For each additional instance, add 1 to the port number.
By default, the Starometry gRPC server is available at 127.0.0.1:50055. For each additional instance of Starometry, add 1 to the port number. Refer to start.sh for more information.
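The port arithmetic can be sketched as follows. The helper `instance_addresses` is hypothetical, and it assumes that an instance `offset` steps above the base simply listens on base port + `offset` for both servers; start.sh remains the authority on how ports are actually assigned.

```python
def instance_addresses(offset, app_port=8003, grpc_port=50055,
                       http_host="localhost", grpc_host="127.0.0.1"):
    """Return (http_url, grpc_target) for the instance `offset` ports above the base.

    Assumption: both servers of one instance share the same offset.
    """
    if offset < 0:
        raise ValueError("offset must not be negative")
    http_url = f"http://{http_host}:{app_port + offset}"
    grpc_target = f"{grpc_host}:{grpc_port + offset}"
    return http_url, grpc_target
```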
There are two types of endpoints: gRPC and HTTP.
{
"status": 200,
"data": {}
}
{
"status": 400,
"path": "path",
"time": "2024-07-09",
"error": "Error"
}
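A client might tell the two envelope shapes above apart as in this sketch. `StarometryError` and `unwrap` are hypothetical names, and the code assumes that an error envelope always carries an `error` member or a status of 400 or higher.

```python
import json


class StarometryError(Exception):
    """Raised when a response uses the error envelope."""

    def __init__(self, status, path, time, message):
        super().__init__(f"{status} at {path}: {message}")
        self.status = status
        self.path = path
        self.time = time


def unwrap(body):
    """Return the `data` member of a success envelope, or raise StarometryError."""
    payload = json.loads(body)
    # Assumption: a missing status means success.
    status = payload.get("status", 200)
    if "error" in payload or status >= 400:
        raise StarometryError(
            status, payload.get("path"), payload.get("time"), payload.get("error")
        )
    return payload.get("data", {})
```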
The endpoint for reading the most recently written metrics.
None
None
{
"nodeId": "e984c7e0-0f83-4870-81e7-0424595c90a5",
"metrics": [
{
"metric_name": "container_network_transmit_bytes_total",
"labels": {
"id": "/",
"interface": "br-0758707fa6ae"
},
"value": 22096194,
"timestamp": 1720546066
}
]
}
The endpoint for adding new metrics to the configuration.
{
"queries": [
"node_filesystem_avail_bytes"
]
}
property | type | description |
---|---|---|
queries | array of strings | Array of strings that are metric names. |
{
"status": 200,
"data": {
"status": "OK"
}
}
The endpoint for getting the latest metrics.
None
{
"data": {
"metrics": [
{
"labels": {
"id": "/"
},
"metric_name": "container_memory_usage_bytes",
"value": 4999598080,
"timestamp": "1720547417"
}
],
"node_id": "038a427d-0c78-495d-8781-07cf2707798d"
}
}
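The two latest-metrics responses shown above differ slightly: one uses `nodeId` and a numeric timestamp, while the other wraps the body in `data`, uses `node_id`, and sends the timestamp as a string. The sketch below normalizes both into one shape; `Sample` and `parse_latest_metrics` are hypothetical names, and the code assumes the payload has already been decoded from JSON.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    name: str
    value: float
    timestamp: int  # Unix seconds
    labels: dict = field(default_factory=dict)


def parse_latest_metrics(payload):
    """Return (node_id, samples) from either response shape."""
    # Unwrap the optional "data" member.
    body = payload.get("data", payload)
    # Accept both spellings of the node identifier.
    node_id = body.get("node_id") or body.get("nodeId")
    samples = [
        Sample(
            name=item["metric_name"],
            value=float(item["value"]),
            timestamp=int(item["timestamp"]),  # works for 1720546066 and "1720547417"
            labels=dict(item.get("labels", {})),
        )
        for item in body.get("metrics", [])
    ]
    return node_id, samples
```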
The endpoint for adding new metrics to the configuration.
{
"metrics": [
"node_filesystem_avail_bytes"
]
}
property | type | description |
---|---|---|
metrics | array of strings | Array of strings that are metric names. |
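The two add-metrics request bodies above differ only in the name of the array member (`queries` in the first example, `metrics` in the second). A request body for either could be assembled as sketched here; `build_metrics_request` is a hypothetical helper, and the cleanup rules (trimming, dropping blanks and duplicates) are assumptions rather than documented behaviour.

```python
import json


def build_metrics_request(names, key="metrics"):
    """Serialize metric names into a request body keyed by `metrics` or `queries`."""
    if key not in ("metrics", "queries"):
        raise ValueError("key must be 'metrics' or 'queries'")
    cleaned = []
    for name in names:
        name = name.strip()
        # Keep the first occurrence of each non-empty name, preserving order.
        if name and name not in cleaned:
            cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one metric name is required")
    return json.dumps({key: cleaned})
```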
{
"data": {
"metrics": [
{
"labels": {
"id": "/"
},
"metric_name": "container_spec_cpu_period",
"value": 0,
"timestamp": "1720547610"
}
],
"node_id": "038a427d-0c78-495d-8781-07cf2707798d"
}
}
The endpoint for adding new addresses for external applications.
{
"external_applications": [
{
"address": "example-address:8080"
}
]
}
property | type | description |
---|---|---|
external_applications | array of applications-url objects | Array of applications-url. |
address | string | String value of the URL. |
{
"external_applications": [
{
"address": "external-app"
}
]
}
The metrics listed below are mandatory and are always included.
- container_cpu_usage_seconds_total:
  - Description: Total cumulative CPU usage of the container.
  - Details: Measures the total CPU time consumed by the container in seconds. This includes both user and system CPU time.
- container_spec_cpu_quota:
  - Description: CPU quota limit set for the container.
  - Details: Indicates the maximum amount of CPU time that the container can use during a given period. This is specified in microseconds.
- container_memory_usage_bytes:
  - Description: Current memory usage of the container.
  - Details: Shows the total memory usage in bytes, including all memory required by the container's processes, cache, and buffers.
- container_spec_memory_limit_bytes:
  - Description: Memory limit set for the container.
  - Details: Specifies the maximum amount of memory the container is allowed to use, in bytes.
- container_fs_usage_bytes:
  - Description: File system usage by the container.
  - Details: Represents the total disk space used by the container's filesystem, in bytes.
- container_spec_cpu_period:
  - Description: CPU period for container scheduling.
  - Details: Defines the length of the time period in microseconds for CPU allocation, used in conjunction with container_spec_cpu_quota to control CPU resource allocation.
- container_network_receive_bytes_total:
  - Description: Total bytes received by the container.
  - Details: Measures the total number of bytes received over the network interfaces of the container.
- container_network_transmit_bytes_total:
  - Description: Total bytes transmitted by the container.
  - Details: Measures the total number of bytes sent over the network interfaces of the container.
- node_cpu_seconds_total:
  - Description: Total CPU usage of the node.
  - Details: Represents the cumulative CPU time used by all processes on the node, in seconds.
- node_memory_MemTotal_bytes:
  - Description: Total memory available on the node.
  - Details: Shows the total amount of physical memory (RAM) available on the node, in bytes.
- node_memory_MemAvailable_bytes:
  - Description: Available memory on the node.
  - Details: Indicates the amount of memory that is available for use by processes on the node, in bytes. This includes free memory and reclaimable memory from caches and buffers.
- node_filesystem_size_bytes:
  - Description: Total size of the node's filesystem.
  - Details: Represents the total capacity of the node's filesystem, in bytes.
- node_filesystem_free_bytes:
  - Description: Free space in the node's filesystem.
  - Details: Indicates the amount of unused space in the node's filesystem, in bytes.
- node_network_receive_bytes_total:
  - Description: Total bytes received by the node.
  - Details: Measures the total number of bytes received over all network interfaces on the node.
- node_network_transmit_bytes_total:
  - Description: Total bytes transmitted by the node.
  - Details: Measures the total number of bytes sent over all network interfaces on the node.
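As noted in the list above, container_spec_cpu_quota and container_spec_cpu_period are used together: dividing the quota by the period gives the CPU limit in cores. The sketch below shows that arithmetic and a matching memory ratio. The function names are hypothetical, and treating non-positive values as "no limit" is an assumption made for this example.

```python
def cpu_limit_cores(quota_us, period_us):
    """CPU limit in cores from quota and period (both in microseconds).

    Returns None when either value is not positive, which this sketch
    treats as "no limit configured".
    """
    if quota_us <= 0 or period_us <= 0:
        return None
    return quota_us / period_us


def memory_utilization(usage_bytes, limit_bytes):
    """Fraction of the memory limit in use, or None when no limit is set."""
    if limit_bytes <= 0:
        return None
    return usage_bytes / limit_bytes
```

For example, a quota of 50000 with a period of 100000 corresponds to half a core.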
Some metrics are calculated and categorized as custom. They fall into two categories: those calculated for nodes and those calculated for containers.
- custom_node_cpu_usage_percentage
- custom_node_ram_available_mb
- custom_node_ram_total_mb
- custom_node_disk_usage_gb
- custom_node_disk_total_gb
- custom_node_network_receieve_mb
- custom_node_network_transmit_mb
- custom_service_cpu_usage
- custom_service_ram_usage_mb
- custom_service_disk_usage_mb
- custom_service_network_receive_mb
- custom_service_network_transmit_mb
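How Starometry derives these values is not described here. The sketch below shows one common way such figures are computed from the raw counters: a CPU percentage from two readings of a cumulative seconds counter, and a byte-to-megabyte conversion. The function names, the 1024 * 1024 divisor, and the counter-reset handling are all assumptions, not Starometry's actual formulas.

```python
def bytes_to_mb(value_bytes):
    """Convert bytes to megabytes, assuming 1 MB = 1024 * 1024 bytes."""
    return value_bytes / (1024 * 1024)


def cpu_usage_percent(prev_seconds, prev_ts, curr_seconds, curr_ts):
    """CPU usage between two samples of a cumulative CPU-seconds counter.

    The result is a percentage of one core; timestamps are Unix seconds.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    used = curr_seconds - prev_seconds
    if used < 0:
        # Counter went backwards (e.g. container restart): count from zero.
        used = curr_seconds
    return used / elapsed * 100.0
```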
Starometry communicates with the Healthcheck service from Protostar via NATS: it sends the latest scraped metrics to Healthcheck.