[Tencent Rhino-Bird Open Source Project Practice] Prometheus plugin enhancement (PUSH mode support, etc.) #175
Changes from 22 commits
809dfe0
ccfc856
8a1de0e
c879973
e1916f8
b980723
7f42fbd
cd24612
816bc28
3d907d6
2309769
1a5bb30
8cb45d4
a931cfc
3a9fe81
35ef136
f87a954
973bf4b
18dc6d7
919b8b3
c986e4f
12afca9
16b5b8c
078f31d
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -57,6 +57,11 @@ plugins: | |
const_labels: | ||
key1: value1 | ||
key2: value2 | ||
auth_cfg: | ||
Review comment: The auth configuration does not need to be covered here; a separate section is enough. Users typically copy this configuration verbatim, and in the default case where auth is not enabled, including these settings will cause errors. |
||
iss: admin | ||
sub: prometheus-pull | ||
aud: trpc-server | ||
secret: test | ||
``` | ||
|
||
Configuration items: | |
|
@@ -65,6 +70,7 @@ plugins: | |
| ------ | ------ | ------ | ------ | | ||
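| histogram_module_cfg | Sequence | No, defaults to [1, 10, 100, 1000] | Bucket boundaries for the latency distribution of inter-module call monitoring, in ms | |
| const_labels | Mapping | No, empty by default | Labels attached by default to every RPC statistic | |
| auth_cfg | Mapping | No, empty by default | Authentication parameters | |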
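
To make the `histogram_module_cfg` boundaries concrete, the following is a minimal sketch of how a latency could be mapped to a bucket, assuming the usual Prometheus convention that each boundary is an inclusive upper bound (`le`) and that values above the last boundary fall into an implicit `+Inf` bucket. `BucketIndex` is a hypothetical helper written for illustration; it is not part of the plugin.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first bucket whose upper bound is >= latency_ms.
// A result equal to upper_bounds.size() stands for the implicit +Inf bucket.
// Assumption: upper_bounds is sorted in ascending order.
std::size_t BucketIndex(const std::vector<double>& upper_bounds, double latency_ms) {
  return static_cast<std::size_t>(
      std::lower_bound(upper_bounds.begin(), upper_bounds.end(), latency_ms) -
      upper_bounds.begin());
}
```

With the default boundaries, a 5 ms call lands in the second bucket (upper bound 10) and a 1500 ms call lands in the `+Inf` bucket.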
|
||
## Inter-module call monitoring reporting | |
|
||
|
@@ -99,6 +105,7 @@ client: | |
| aApp | Caller app name | |
| aServer | Caller server name | |
| aService | Caller service name | |
| aIp | Caller IP address | |
| pApp | Callee app name | |
| pServer | Callee server name | |
| pService | Callee service name | |
|
@@ -109,6 +116,7 @@ client: | |
| frame_ret_code | Framework error code of the call | |
| interface_ret_code | Interface error code of the call | |
|
||
|
||
### Callee monitoring reporting | |
|
||
Simply add the `prometheus` filter to `server` in the framework configuration file to enable callee monitoring: | |
|
@@ -124,7 +132,6 @@ server: | |
|
||
Statistics: | |
|
||
```mermaid | ||
| Metric name | Metric type | Description | |
| ------ | ------ | ------ | |
| rpc_server_counter_metric | Counter | Total number of requests received by the server | |
|
@@ -149,7 +156,7 @@ server: | |
| pConSetId | Set the callee belongs to | |
| frame_ret_code | Framework error code of the call | |
| interface_ret_code | Interface error code of the call | |
``` | ||
|
||
|
||
## Attribute monitoring reporting | |
|
||
|
@@ -314,7 +321,7 @@ single_metrics_info.single_attr_info.value = 1; | |
|
||
#### Generic multi-dimensional attribute reporting | |
|
||
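The Prometheus monitoring plugin supports the framework's generic multi-dimensional attribute reporting: construct a `::trpc::TrpcMultiAttrMetricsInfo` and report it through the `::trpc::metrics::MultiAttrReport` interface. **In Prometheus, single-dimensional attribute reporting means reporting data whose labels contain multiple key-value pairs.** | |
The Prometheus monitoring plugin supports the framework's generic multi-dimensional attribute reporting: construct a `::trpc::TrpcMultiAttrMetricsInfo` and report it through the `::trpc::metrics::MultiAttrReport` interface. **In Prometheus, multi-dimensional attribute reporting means reporting data whose labels contain multiple key-value pairs.** | |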
|
||
Note the following when setting `::trpc::TrpcMultiAttrMetricsInfo` values: | |
|
||
|
@@ -398,3 +405,96 @@ std::vector<::prometheus::MetricFamily> Collect(); | |
## Retrieving via admin | |
|
||
If the service has the [admin feature](./admin_service.md) enabled, the Prometheus data, serialized to a string, can be fetched from `http://admin_ip:admin_port/metrics`. | |
|
||
# Authentication | |
|
||
Prometheus plugin authentication has two modes, pull and push, and each is configured differently. | |
|
||
## Pull mode | |
|
||
In pull mode, authentication uses JSON Web Token (JWT). Both the **tRPC Prometheus plugin** and the **Prometheus server** must be configured. | |
|
||
### Plugin configuration | |
|
||
Sample plugin configuration: | |
|
||
```yaml | ||
plugins: | ||
metrics: | ||
prometheus: | ||
auth_cfg: | ||
iss: admin # issuer | |
sub: prometheus-pull # subject | |
aud: trpc-server # audience | |
secret: test # signing secret | |
``` | ||
|
||
### Prometheus server configuration | |
|
||
```yaml | ||
global: | ||
scrape_interval: 15s | ||
evaluation_interval: 15s | ||
|
||
scrape_configs: | ||
- job_name: trpc-cpp | ||
static_configs: | ||
- targets: ['127.0.0.1:8889'] | ||
labels: | ||
instance: trpc-cpp | ||
bearer_token: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJwcm9tZXRoZXVzLXB1bGwiLCJhdWQiOiJ0cnBjLXNlcnZlciIsImlzcyI6ImFkbWluIiwiaWF0IjoxNTE2MjM5MDIyfQ.WWYY3jgxelzAXzX0IJSmZUeQeqb5YLV4oBAO7FTUI5o | ||
``` | ||
|
||
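The **bearer_token** field must be set. The token can be generated with the [official JWT tool](https://jwt.io/): fill in the matching iss, sub and aud claims in the payload, enter the secret under "verify signature", and keep the default signing algorithm, HS256. | |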
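
Before pasting a token into `bearer_token`, it can help to confirm that its payload carries the same `iss`, `sub` and `aud` values as `auth_cfg`. The sketch below only decodes the payload segment of a compact JWT; it does not verify the HS256 signature. `Base64UrlDecode` and `JwtPayloadJson` are hypothetical helpers written for this illustration and are not part of the plugin or of jwt-cpp.

```cpp
#include <cassert>
#include <string>

// Decodes base64url text (RFC 4648 section 5), with or without '=' padding.
// Returns false if a character outside the alphabet is found.
bool Base64UrlDecode(const std::string& in, std::string* out) {
  auto value_of = [](char c) -> int {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '-') return 62;
    if (c == '_') return 63;
    return -1;
  };
  out->clear();
  unsigned int buffer = 0;
  int bits = 0;
  for (char c : in) {
    if (c == '=') break;  // stop at optional padding
    const int v = value_of(c);
    if (v < 0) return false;
    buffer = (buffer << 6) | static_cast<unsigned int>(v);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out->push_back(static_cast<char>((buffer >> bits) & 0xFF));
    }
  }
  return true;
}

// Extracts the JSON payload (second dot-separated segment) of a compact JWT.
// NOTE: this does NOT check the signature; it is for inspection only.
bool JwtPayloadJson(const std::string& token, std::string* payload_json) {
  const std::string::size_type first = token.find('.');
  if (first == std::string::npos) return false;
  const std::string::size_type second = token.find('.', first + 1);
  if (second == std::string::npos) return false;
  return Base64UrlDecode(token.substr(first + 1, second - first - 1), payload_json);
}
```

Calling `JwtPayloadJson` on the sample token above yields a JSON object whose `sub` claim is `prometheus-pull`, matching the plugin configuration.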
|
||
## Push mode | |
|
||
In push mode, authentication uses a **username** and **password** to stay compatible with Pushgateway. | |
|
||
### Plugin configuration | |
|
||
Sample plugin configuration: | |
|
||
```yaml | ||
plugins: | ||
metrics: | ||
prometheus: | ||
auth_cfg: | ||
username: admin | ||
password: test | ||
``` | ||
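
For reference, HTTP Basic authentication sends the credentials as `Authorization: Basic <base64(username:password)>`. The sketch below shows how the sample `username` and `password` would be turned into that header value. `Base64Encode` and `BasicAuthHeaderValue` are hypothetical helpers for illustration; the plugin's own code path may build the header differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Standard base64 (RFC 4648 section 4) with '=' padding.
std::string Base64Encode(const std::string& in) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(((in.size() + 2) / 3) * 4);
  for (std::size_t i = 0; i < in.size(); i += 3) {
    const std::size_t remaining = in.size() - i;
    const unsigned int b0 = static_cast<unsigned char>(in[i]);
    const unsigned int b1 = remaining > 1 ? static_cast<unsigned char>(in[i + 1]) : 0;
    const unsigned int b2 = remaining > 2 ? static_cast<unsigned char>(in[i + 2]) : 0;
    const unsigned int triple = (b0 << 16) | (b1 << 8) | b2;
    out.push_back(kAlphabet[(triple >> 18) & 0x3F]);
    out.push_back(kAlphabet[(triple >> 12) & 0x3F]);
    // Pad with '=' when the final group has fewer than three input bytes.
    out.push_back(remaining > 1 ? kAlphabet[(triple >> 6) & 0x3F] : '=');
    out.push_back(remaining > 2 ? kAlphabet[triple & 0x3F] : '=');
  }
  return out;
}

// Builds the value of the Authorization header for HTTP Basic auth.
std::string BasicAuthHeaderValue(const std::string& username, const std::string& password) {
  return "Basic " + Base64Encode(username + ":" + password);
}
```

For the sample configuration, `BasicAuthHeaderValue("admin", "test")` produces `Basic YWRtaW46dGVzdA==`.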
|
||
### Prometheus server configuration | |
Review comment: How to configure Prometheus does not need to be explained here; users who enable authentication have generally configured it already. |
||
|
||
```yaml | ||
global: | ||
scrape_interval: 15s | ||
evaluation_interval: 15s | ||
|
||
scrape_configs: | ||
- job_name: pushgateway | ||
static_configs: | ||
- targets: ['172.17.0.5:9091'] # address of the Pushgateway server | |
labels: | ||
instance: pushgateway_instance | ||
basic_auth: | ||
username: admin | ||
password: test | ||
``` | ||
|
||
### Pushgateway server configuration | |
|
||
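The Pushgateway server must be started with a configuration file that contains the **bcrypt** hash of the password. The Pushgateway startup configuration file looks like this: | |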
|
||
```yaml | ||
basic_auth_users: | ||
admin: $2b$12$kXxrZP74Fmjh6Wih0Ignu.uWSiojl5aKj4UnMvHN9s2h/Lc/ui0.S | ||
``` | ||
|
||
The password hash can be generated with the htpasswd tool:
```shell | ||
> htpasswd -nbB admin test | ||
admin:$2y$05$5uq4H5p8JyfQm.e16o3xduW6tkI2bTRpArTK4MF4dEuvncpz/bqy. | ||
``` | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,3 +44,13 @@ cc_library( | |
"@trpc_cpp//trpc/metrics/prometheus:prometheus_metrics_api", | ||
], | ||
) | ||
|
||
cc_binary( | ||
Review comment: There is no need to push this file; remove the build references related to it. |
||
name = "push", | ||
srcs = ["push.cc"], | ||
deps = [ | ||
"@trpc_cpp//trpc/metrics/prometheus:prometheus_metrics_api", | ||
"@trpc_cpp//trpc/log:trpc_log", | ||
|
||
], | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -77,6 +77,12 @@ ::trpc::Status ForwardServiceImpl::Route(::trpc::ServerContextPtr context, | |
"counter_name", "counter_desc", {{"const_counter_key", "const_counter_value"}}); | ||
::prometheus::Counter& counter = counter_family->Add({{"counter_key", "counter_value"}}); | ||
counter.Increment(random_num); | ||
|
||
if (::trpc::prometheus::PushMetricsInfo()) { | ||
Review comment: Why does this still need to be called manually here? Can't it take effect just by configuring the YAML file? |
||
TRPC_FMT_INFO("Successfully pushed metrics to Pushgateway"); | ||
} else { | ||
TRPC_FMT_ERROR("Failed to push metrics to Pushgateway"); | ||
} | ||
#endif | ||
|
||
auto client_context = ::trpc::MakeClientContext(context, greeter_proxy_); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#include <chrono> | ||
Review comment: This file is unrelated to the framework and does not need to be added; documenting the usage is enough. |
||
#include <iostream> | |
#include <thread> | |
#include "trpc/metrics/prometheus/prometheus_metrics_api.h" | ||
#include "trpc/log/trpc_log.h" | ||
|
||
|
||
|
||
int main(int argc, char** argv) { | ||
|
||
while (true) { | ||
if (::trpc::prometheus::PushMetricsInfo()) | ||
{ | ||
std::cout << "Successfully pushed metrics to Pushgateway" << std::endl; | ||
} else { | ||
std::cerr << "Failed to push metrics to Pushgateway" << std::endl; | ||
} | ||
|
||
std::this_thread::sleep_for(std::chrono::seconds(5)); // push once every 5 seconds | |
} | ||
|
||
return 0; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,6 +44,16 @@ plugins: | |
const_labels: | ||
const_key1: const_value1 | ||
const_key2: const_value2 | ||
push_mode: | ||
Review comment: The example needs to demonstrate both pull mode and push mode, so two configuration files should be provided. |
||
enabled: true | ||
gateway_url: "http://pushgateway:9091" | ||
job_name: "test_job" | ||
push_interval_seconds: 2 | ||
auth_cfg: | ||
iss: admin | ||
sub: prometheus-pull | ||
aud: trpc-server | ||
secret: test | ||
log: | ||
default: | ||
- name: default | ||
|
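
For context on `gateway_url` and `job_name`: Pushgateway accepts pushes on paths of the form `/metrics/job/<JOB_NAME>`. The sketch below shows how the two settings could be combined into that endpoint. `PushgatewayJobUrl` is a hypothetical helper; whether the plugin composes the URL this way is not shown in this diff. The sketch assumes a plain job name without `/` or other characters that need escaping.

```cpp
#include <cassert>
#include <string>

// Joins the gateway base URL and the job name into a Pushgateway push endpoint.
// Trailing slashes on the base URL are dropped to avoid a double '/'.
// Assumption: job_name contains no '/' and needs no URL escaping.
std::string PushgatewayJobUrl(const std::string& gateway_url, const std::string& job_name) {
  std::string url = gateway_url;
  while (!url.empty() && url.back() == '/') {
    url.pop_back();
  }
  return url + "/metrics/job/" + job_name;
}
```

With the sample values, the result is `http://pushgateway:9091/metrics/job/test_job`.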
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
package( | ||
default_visibility = ["//visibility:public"], | ||
) | ||
|
||
cc_library( | ||
name = "jwt-cpp", | ||
hdrs = glob(["**/*.h"]), | ||
deps = [ | ||
"@com_github_openssl_openssl//:libcrypto", | ||
"@com_github_openssl_openssl//:libssl", | ||
], | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,10 +20,87 @@ namespace trpc::admin { | |
|
||
PrometheusHandler::PrometheusHandler() { description_ = "[GET /metrics] get prometheus metrics"; } | ||
|
||
void PrometheusHandler::Init() { | ||
PrometheusConfig prometheus_conf; | ||
bool ret = TrpcConfig::GetInstance()->GetPluginConfig<PrometheusConfig>( | ||
"metrics", trpc::prometheus::kPrometheusMetricsName, prometheus_conf); | ||
if (!ret) { | ||
TRPC_LOG_WARN( | ||
"Failed to obtain Prometheus plugin configuration from the framework configuration file. Default configuration " | ||
"will be used."); | ||
} | ||
auth_cfg_ = prometheus_conf.auth_cfg; | ||
} | ||
|
||
bool PrometheusHandler::CheckTokenAuth(std::string bearer_token) { | ||
auto splited = Split(bearer_token, ' '); | ||
if (splited.size() != 2) { | ||
TRPC_FMT_ERROR("error token: {}", bearer_token); | ||
return false; | ||
} | ||
auto method = splited[0]; | ||
if (method != "Bearer") { | ||
TRPC_FMT_ERROR("error auth method: {}", method); | ||
return false; | ||
} | ||
std::string token = std::string(splited[1]); | ||
if (!Jwt::isValid(token, auth_cfg_)) { | ||
TRPC_FMT_ERROR("error token: {}", token); | ||
return false; | ||
} | ||
Review comment: The constructor does too much; define an Init function and move this logic into it. Reply: Fixed. |
||
return true; | ||
} | ||
|
||
bool PrometheusHandler::CheckBasicAuth(std::string token) { | ||
auto splited = Split(token, ' '); | ||
if (splited.size() != 2) { | ||
TRPC_FMT_ERROR("error token: {}", token); | ||
return false; | ||
} | ||
if (splited[0] != "Basic") { | ||
TRPC_FMT_ERROR("error token: {}", token); | ||
return false; | ||
} | ||
|
||
std::string username_pwd = http::Base64Decode(std::begin(splited[1]), std::end(splited[1])); | ||
auto sp = Split(username_pwd, ':'); | ||
if (sp.size() != 2) { | ||
TRPC_FMT_ERROR("error token: {}", token); | ||
return false; | ||
} | ||
|
||
auto username = sp[0], pwd = sp[1]; | ||
if (username != auth_cfg_["username"] || pwd != auth_cfg_["password"]) { | ||
TRPC_FMT_ERROR("error username or password: username: {}", username); | |
return false; | ||
} | ||
return true; | ||
} | ||
|
||
void PrometheusHandler::CommandHandle(http::HttpRequestPtr req, rapidjson::Value& result, | ||
rapidjson::Document::AllocatorType& alloc) { | ||
static std::unique_ptr<::prometheus::Serializer> serializer = std::make_unique<::prometheus::TextSerializer>(); | ||
|
||
std::string token = req->GetHeader("authorization"); | ||
|
||
if (!auth_cfg_.empty()) { | ||
if (auth_cfg_.count("username") && auth_cfg_.count("password")) { | ||
// push mode | ||
// use the basic auth if already config the username and password. | ||
if (!CheckBasicAuth(token)) { | ||
result.AddMember("message", "wrong request without right username or password", alloc); | ||
return; | ||
} | ||
} else { | ||
// pull mode | ||
// use the json web token auth. | ||
if (!CheckTokenAuth(token)) { | ||
result.AddMember("message", "wrong request without right token", alloc); | ||
return; | ||
} | ||
} | ||
} | ||
|
||
std::string prometheus_str = serializer->Serialize(trpc::prometheus::Collect()); | ||
result.AddMember(rapidjson::StringRef("trpc-html"), rapidjson::Value(prometheus_str, alloc).Move(), alloc); | ||
} | ||
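
One detail in `CheckBasicAuth` is worth a sketch. If `Split` returns every `:`-separated piece (an assumption about the framework helper), then the `sp.size() != 2` check rejects any password that itself contains `:`. In HTTP Basic auth the user-id cannot contain a colon but the password may, so splitting at the first colon only is the safer approach. `SplitBasicCredentials` below is a hypothetical helper illustrating that; it is not the PR's code.

```cpp
#include <cassert>
#include <string>

// Splits decoded Basic credentials "user:pass" at the FIRST ':' only,
// so that passwords containing ':' are preserved intact.
// Returns false if no ':' is present.
bool SplitBasicCredentials(const std::string& decoded, std::string* username,
                           std::string* password) {
  const std::string::size_type colon = decoded.find(':');
  if (colon == std::string::npos) return false;
  *username = decoded.substr(0, colon);
  *password = decoded.substr(colon + 1);
  return true;
}
```

For example, `admin:a:b` is split into username `admin` and password `a:b`, while input without any colon is rejected.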
|
Review comment: Prometheus is disabled by default, so this line can be removed.