Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for all Prometheus metric types + internal refactor. #80

Merged
merged 2 commits into from
Sep 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
## unreleased

* [FEATURE] add support for all metric types; deprecated --metric-count flag; --*-interval flags set to 0 means no change; added OpenMetrics support.

## 0.5.0 / 2024-09-15

* [FEATURE] add two new operation modes for metrics generation and cycling; gradual-change between min and max and double-halve #64
* (other changes, not captured due to lost 0.4.0 tag)

## 0.4.0 / 2022-03-08

Expand Down
7 changes: 5 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
# avalanche

Avalanche serves a text-based [Prometheus metrics](https://prometheus.io/docs/instrumenting/exposition_formats/) endpoint for load testing [Prometheus](https://prometheus.io/) and possibly other [OpenMetrics](https://github.com/OpenObservability/OpenMetrics) consumers.
Avalanche is a load-testing binary capable of generating metrics that can be either:

Avalanche also supports load testing for services accepting data via Prometheus remote_write API such as [Thanos](https://github.com/improbable-eng/thanos), [Cortex](https://github.com/weaveworks/cortex), [M3DB](https://m3db.github.io/m3/integrations/prometheus/), [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/) and other services [listed here](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage).
* scraped via [Prometheus scrape formats](https://prometheus.io/docs/instrumenting/exposition_formats/) (including [OpenMetrics](https://github.com/OpenObservability/OpenMetrics)) endpoint.
* written via Prometheus Remote Write (v1 only for now) to a target endpoint.

This allows load testing services that can scrape (e.g. Prometheus, OpenTelemetry Collector and so), as well as, services accepting data via Prometheus remote_write API such as [Thanos](https://github.com/thanos-io/thanos), [Cortex](https://github.com/cortexproject/cortex), [M3DB](https://m3db.github.io/m3/integrations/prometheus/), [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/) and other services [listed here](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage).

Metric names and unique series change over time to simulate series churn.

Expand Down
104 changes: 48 additions & 56 deletions cmd/avalanche.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,51 +14,29 @@
package main

import (
"context"
"crypto/tls"
"fmt"
"log"
"math/rand"
"net/http"
"os"
"strconv"
"sync"
"syscall"
"time"

"github.com/nelkinda/health-go"
"github.com/oklog/run"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"github.com/prometheus/common/version"
kingpin "gopkg.in/alecthomas/kingpin.v2"
"gopkg.in/alecthomas/kingpin.v2"

"github.com/prometheus-community/avalanche/metrics"
"github.com/prometheus-community/avalanche/pkg/download"
)

var (
metricCount = kingpin.Flag("metric-count", "Number of metrics to serve.").Default("500").Int()
labelCount = kingpin.Flag("label-count", "Number of labels per-metric.").Default("10").Int()
seriesCount = kingpin.Flag("series-count", "Number of series per-metric.").Default("100").Int()
seriesChangeRate = kingpin.Flag("series-change-rate", "The rate at which the number of active series changes over time. Applies to 'gradual-change' mode.").Default("100").Int()
maxSeriesCount = kingpin.Flag("max-series-count", "Maximum number of series to serve. Applies to 'gradual-change' mode.").Default("1000").Int()
minSeriesCount = kingpin.Flag("min-series-count", "Minimum number of series to serve. Applies to 'gradual-change' mode.").Default("100").Int()
spikeMultiplier = kingpin.Flag("spike-multiplier", "Multiplier for the spike mode.").Default("1.5").Float64()
metricLength = kingpin.Flag("metricname-length", "Modify length of metric names.").Default("5").Int()
labelLength = kingpin.Flag("labelname-length", "Modify length of label names.").Default("5").Int()
constLabels = kingpin.Flag("const-label", "Constant label to add to every metric. Format is labelName=labelValue. Flag can be specified multiple times.").Strings()
valueInterval = kingpin.Flag("value-interval", "Change series values every {interval} seconds.").Default("30").Int()
labelInterval = kingpin.Flag("series-interval", "Change series_id label values every {interval} seconds.").Default("60").Int()
metricInterval = kingpin.Flag("metric-interval", "Change __name__ label values every {interval} seconds.").Default("120").Int()
seriesChangeInterval = kingpin.Flag("series-change-interval", "Change the number of series every {interval} seconds. Applies to 'gradual-change', 'double-halve' and 'spike' modes.").Default("30").Int()
seriesOperationMode = kingpin.Flag("series-operation-mode", "Mode of operation: 'gradual-change', 'double-halve', 'spike'").Default("default").String()
port = kingpin.Flag("port", "Port to serve at").Default("9001").Int()
remoteURL = kingpin.Flag("remote-url", "URL to send samples via remote_write API.").URL()
remotePprofURLs = kingpin.Flag("remote-pprof-urls", "a list of urls to download pprofs during the remote write: --remote-pprof-urls=http://127.0.0.1:10902/debug/pprof/heap --remote-pprof-urls=http://127.0.0.1:10902/debug/pprof/profile").URLList()
remotePprofInterval = kingpin.Flag("remote-pprof-interval", "how often to download pprof profiles.When not provided it will download a profile once before the end of the test.").Duration()
remoteBatchSize = kingpin.Flag("remote-batch-size", "how many samples to send with each remote_write API request.").Default("2000").Int()
remoteRequestCount = kingpin.Flag("remote-requests-count", "how many requests to send in total to the remote_write API.").Default("100").Int()
remoteReqsInterval = kingpin.Flag("remote-write-interval", "delay between each remote write request.").Default("100ms").Duration()
remoteTenant = kingpin.Flag("remote-tenant", "Tenant ID to include in remote_write send").Default("0").String()
tlsClientInsecure = kingpin.Flag("tls-client-insecure", "Skip certificate check on tls connection").Default("false").Bool()
remoteTenantHeader = kingpin.Flag("remote-tenant-header", "Tenant ID to include in remote_write send. The default, is the default tenant header expected by Cortex.").Default("X-Scope-OrgID").String()
outOfOrder = kingpin.Flag("out-of-order", "Enable out-of-order timestamps in remote write requests").Default("true").Bool()
)

func main() {
kingpin.Version(version.Print("avalanche"))
log.SetFlags(log.Ltime | log.Lshortfile) // Show file name and line in logs.
Expand All @@ -84,27 +62,35 @@ func main() {
" then returns it to the original count on the next tick. This pattern repeats indefinitely,\n" +
" creating a spiking effect in the series count.\n"

cfg := metrics.NewConfigFromFlags(kingpin.Flag)
port := kingpin.Flag("port", "Port to serve at").Default("9001").Int()
remoteURL := kingpin.Flag("remote-url", "URL to send samples via remote_write API.").URL()
// TODO(bwplotka): Kill pprof feature, you can install OSS continuous profiling easily instead.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!

remotePprofURLs := kingpin.Flag("remote-pprof-urls", "a list of urls to download pprofs during the remote write: --remote-pprof-urls=http://127.0.0.1:10902/debug/pprof/heap --remote-pprof-urls=http://127.0.0.1:10902/debug/pprof/profile").URLList()
remotePprofInterval := kingpin.Flag("remote-pprof-interval", "how often to download pprof profiles. When not provided it will download a profile once before the end of the test.").Duration()
remoteBatchSize := kingpin.Flag("remote-batch-size", "how many samples to send with each remote_write API request.").Default("2000").Int()
remoteRequestCount := kingpin.Flag("remote-requests-count", "how many requests to send in total to the remote_write API.").Default("100").Int()
remoteReqsInterval := kingpin.Flag("remote-write-interval", "delay between each remote write request.").Default("100ms").Duration()
remoteTenant := kingpin.Flag("remote-tenant", "Tenant ID to include in remote_write send").Default("0").String()
tlsClientInsecure := kingpin.Flag("tls-client-insecure", "Skip certificate check on tls connection").Default("false").Bool()
remoteTenantHeader := kingpin.Flag("remote-tenant-header", "Tenant ID to include in remote_write send. The default, is the default tenant header expected by Cortex.").Default("X-Scope-OrgID").String()
bwplotka marked this conversation as resolved.
Show resolved Hide resolved
// TODO(bwplotka): Make this a non-bool flag (e.g. out-of-order-min-time).
outOfOrder := kingpin.Flag("out-of-order", "Enable out-of-order timestamps in remote write requests").Default("true").Bool()

kingpin.Parse()
if *maxSeriesCount <= *minSeriesCount {
fmt.Fprintf(os.Stderr, "Error: --max-series-count (%d) must be greater than --min-series-count (%d)\n", *maxSeriesCount, *minSeriesCount)
os.Exit(1)
}
if *minSeriesCount < 0 {
fmt.Fprintf(os.Stderr, "Error: --min-series-count must be 0 or higher, got %d\n", *minSeriesCount)
os.Exit(1)
}
if *seriesChangeRate <= 0 {
fmt.Fprintf(os.Stderr, "Error: --series-change-rate must be greater than 0, got %d\n", *seriesChangeRate)
os.Exit(1)
if err := cfg.Validate(); err != nil {
kingpin.FatalUsage("configuration error: %v", err)
}

stop := make(chan struct{})
defer close(stop)
updateNotify, err := metrics.RunMetrics(*metricCount, *labelCount, *seriesCount, *seriesChangeRate, *maxSeriesCount, *minSeriesCount, *metricLength, *labelLength, *valueInterval, *labelInterval, *metricInterval, *seriesChangeInterval, *spikeMultiplier, *seriesOperationMode, *constLabels, stop)
if err != nil {
log.Fatal(err)
}
collector := metrics.NewCollector(*cfg)
reg := prometheus.NewRegistry()
reg.MustRegister(collector)

var g run.Group
g.Add(run.SignalHandler(context.Background(), os.Interrupt, syscall.SIGTERM))
g.Add(collector.Run, collector.Stop)

// One-off remote write send mode.
if *remoteURL != nil {
if (**remoteURL).Host == "" || (**remoteURL).Scheme == "" {
log.Fatal("remote host and scheme can't be empty")
Expand All @@ -118,7 +104,7 @@ func main() {
RequestInterval: *remoteReqsInterval,
BatchSize: *remoteBatchSize,
RequestCount: *remoteRequestCount,
UpdateNotify: updateNotify,
UpdateNotify: collector.UpdateNotifyCh(),
Tenant: *remoteTenant,
TLSClientConfig: tls.Config{
InsecureSkipVerify: *tlsClientInsecure,
Expand Down Expand Up @@ -158,11 +144,10 @@ func main() {
wg.Done()
}
}()

}

// First cut: just send the metrics once then exit
err := metrics.SendRemoteWrite(config)
if err != nil {
if err := metrics.SendRemoteWrite(config, reg); err != nil {
log.Fatal(err)
}
if *remotePprofInterval > 0 {
Expand All @@ -172,9 +157,16 @@ func main() {
return
}

fmt.Printf("Serving your metrics at localhost:%v/metrics\n", *port)
err = metrics.ServeMetrics(*port)
if err != nil {
log.Fatal(err)
}
// Standard mode for continuous exposure of metrics.
httpSrv := &http.Server{Addr: fmt.Sprintf(":%v", *port)}
g.Add(func() error {
fmt.Printf("Serving your metrics at :%v/metrics\n", *port)
http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{
EnableOpenMetrics: true,
}))
bwplotka marked this conversation as resolved.
Show resolved Hide resolved
http.HandleFunc("/health", health.New(health.Health{}).Handler)
return httpSrv.ListenAndServe()
}, func(_ error) {
_ = httpSrv.Shutdown(context.Background())
})
}
1 change: 1 addition & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ require (
github.com/gogo/protobuf v1.3.2
github.com/golang/snappy v0.0.4
github.com/nelkinda/health-go v0.0.1
github.com/oklog/run v1.1.0
github.com/prometheus/client_golang v1.20.3
github.com/prometheus/client_model v0.6.1
github.com/prometheus/common v0.57.0
Expand Down
2 changes: 2 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,8 @@ github.com/nelkinda/health-go v0.0.1/go.mod h1:oNvFVrveHIH/xPW5DqjFfdtlyhLXHFmNz
github.com/nelkinda/http-go v0.0.1 h1:RL3RttZzzs/kzQaVPmtv5dxMaWValqCqGNjCXyjPI1k=
github.com/nelkinda/http-go v0.0.1/go.mod h1:DxPiZGVufTVSeO63nmVR5QO01TmSC0HHtEIZTHL5QEk=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
github.com/oklog/run v1.1.0 h1:GEenZ1cK0+q0+wsJew9qUg/DyD8k3JzYsZAi5gYi2mA=
github.com/oklog/run v1.1.0/go.mod h1:sVPdnTZT1zYwAJeCMu2Th4T21pA3FPOQRfWjQlk7DVU=
github.com/pelletier/go-toml v1.4.0/go.mod h1:PN7xzY2wHTK0K9p34ErDQMlFxa51Fk0OUruD3k1mMwo=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
Expand Down
Loading
Loading