[pull] master from cortexproject:master #650

Merged 6 commits on Oct 20, 2024
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -21,8 +21,13 @@
 * [ENHANCEMENT] Ingester: Add matchers to ingester LabelNames() and LabelNamesStream() RPC. #6209
 * [ENHANCEMENT] Ingester/Store Gateway Clients: Introduce an experimental HealthCheck handler to quickly fail requests directed to unhealthy targets. #6225 #6257
 * [ENHANCEMENT] Upgrade build image and Go version to 1.23.2. #6261 #6262
+* [ENHANCEMENT] Querier/Ruler: Expose `store_gateway_consistency_check_max_attempts` for max retries when querying store gateway in consistency check. #6276
 * [BUGFIX] Runtime-config: Handle absolute file paths when working directory is not / #6224

+## 1.18.1 2024-10-14
+
+* [BUGFIX] Backporting upgrade to go 1.22.7 to patch CVE-2024-34155, CVE-2024-34156, CVE-2024-34158 #6217 #6264
+
 ## 1.18.0 2024-09-03

 * [CHANGE] Ingester: Remove `-querier.query-store-for-labels-enabled` flag. Querying long-term store for labels is always enabled. #5984
4 changes: 2 additions & 2 deletions RELEASE.md
@@ -110,8 +110,8 @@ To publish a stable release:
 1. Do not change the release branch directly; make a PR to the release-X.Y branch with VERSION and any CHANGELOG changes.
 1. Ensure the `VERSION` file has **no** `-rc.X` suffix
 1. Update the Cortex version in the following locations:
-    - Kubernetes manifests located at `k8s/`
-    - Documentation located at `docs/`
+    - `docs/getting-started/.env`
+    - Bump version in cortex-helm-chart via PR, for example https://github.com/cortexproject/cortex-helm-chart/pull/501
 1. After merging your PR to release branch, `git tag` the new release (see [How to tag a release](#how-to-tag-a-release)) from release branch.
 1. Wait until CI pipeline succeeded (once a tag is created, the release process through GitHub actions will be triggered for this tag)
 1. Create a release in GitHub
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-1.18.0
+1.18.1
6 changes: 6 additions & 0 deletions docs/blocks-storage/querier.md
@@ -226,6 +226,12 @@ querier:
   # CLI flag: -querier.store-gateway-query-stats-enabled
   [store_gateway_query_stats: <boolean> | default = true]

+  # The maximum number of times we attempt fetching missing blocks from
+  # different store-gateways. If no more store-gateways are left (i.e. due to
+  # a lower replication factor), then we'll end the retries earlier.
+  # CLI flag: -querier.store-gateway-consistency-check-max-attempts
+  [store_gateway_consistency_check_max_attempts: <int> | default = 3]
+
   # When distributor's sharding strategy is shuffle-sharding and this setting is
   # > 0, queriers fetch in-memory series from the minimum set of required
   # ingesters, selecting only ingesters which may have received series since
6 changes: 6 additions & 0 deletions docs/configuration/config-file-reference.md
@@ -3872,6 +3872,12 @@ store_gateway_client:
 # CLI flag: -querier.store-gateway-query-stats-enabled
 [store_gateway_query_stats: <boolean> | default = true]

+# The maximum number of times we attempt fetching missing blocks from different
+# store-gateways. If no more store-gateways are left (i.e. due to a lower
+# replication factor), then we'll end the retries earlier.
+# CLI flag: -querier.store-gateway-consistency-check-max-attempts
+[store_gateway_consistency_check_max_attempts: <int> | default = 3]
+
 # When distributor's sharding strategy is shuffle-sharding and this setting is >
 # 0, queriers fetch in-memory series from the minimum set of required ingesters,
 # selecting only ingesters which may have received series since 'now - lookback
9 changes: 9 additions & 0 deletions docs/configuration/single-process-config-blocks-gossip-1.yaml
@@ -96,3 +96,12 @@ ruler_storage:
   backend: local
   local:
     directory: /tmp/cortex/rules
+
+alertmanager:
+  external_url: http://localhost/alertmanager
+
+alertmanager_storage:
+  backend: local
+  local:
+    # Make sure this path exists
+    path: /tmp/cortex/alerts
9 changes: 9 additions & 0 deletions docs/configuration/single-process-config-blocks-gossip-2.yaml
@@ -95,3 +95,12 @@ ruler_storage:
   backend: local
   local:
     directory: /tmp/cortex/rules
+
+alertmanager:
+  external_url: http://localhost/alertmanager
+
+alertmanager_storage:
+  backend: local
+  local:
+    # Make sure this path exists
+    path: /tmp/cortex/alerts
9 changes: 9 additions & 0 deletions docs/configuration/single-process-config-blocks-local.yaml
@@ -88,3 +88,12 @@ ruler_storage:
   backend: local
   local:
     directory: /tmp/cortex/rules
+
+alertmanager:
+  external_url: http://localhost/alertmanager
+
+alertmanager_storage:
+  backend: local
+  local:
+    # Make sure this path exists
+    path: /tmp/cortex/alerts
9 changes: 9 additions & 0 deletions docs/configuration/single-process-config-blocks-tls.yaml
@@ -102,3 +102,12 @@ ruler_storage:
   backend: local
   local:
     directory: /tmp/cortex/rules
+
+alertmanager:
+  external_url: http://localhost/alertmanager
+
+alertmanager_storage:
+  backend: local
+  local:
+    # Make sure this path exists
+    path: /tmp/cortex/alerts
9 changes: 9 additions & 0 deletions docs/configuration/single-process-config-blocks.yaml
@@ -88,3 +88,12 @@ ruler_storage:
   backend: local
   local:
     directory: /tmp/cortex/rules
+
+alertmanager:
+  external_url: http://localhost/alertmanager
+
+alertmanager_storage:
+  backend: local
+  local:
+    # Make sure this path exists
+    path: /tmp/cortex/alerts
4 changes: 2 additions & 2 deletions docs/getting-started/.env
@@ -1,4 +1,4 @@
-CORTEX_VERSION=v1.17.1
+CORTEX_VERSION=v1.18.1
 GRAFANA_VERSION=10.4.2
 PROMETHEUS_VERSION=v2.51.2
-SEAWEEDFS_VERSION=3.67
+SEAWEEDFS_VERSION=3.67
1 change: 1 addition & 0 deletions integration/backward_compatibility_test.go
@@ -74,6 +74,7 @@ var (
         "quay.io/cortexproject/cortex:v1.17.0": nil,
         "quay.io/cortexproject/cortex:v1.17.1": nil,
         "quay.io/cortexproject/cortex:v1.18.0": nil,
+        "quay.io/cortexproject/cortex:v1.18.1": nil,
     }
 )

55 changes: 30 additions & 25 deletions pkg/querier/blocks_store_queryable.go
@@ -141,7 +141,8 @@ type BlocksStoreQueryable struct {
     metrics *blocksStoreQueryableMetrics
     limits BlocksStoreLimits

-    storeGatewayQueryStatsEnabled bool
+    storeGatewayQueryStatsEnabled           bool
+    storeGatewayConsistencyCheckMaxAttempts int

     // Subservices manager.
     subservices *services.Manager
@@ -153,8 +154,7 @@ func NewBlocksStoreQueryable(
     finder BlocksFinder,
     consistency *BlocksConsistencyChecker,
     limits BlocksStoreLimits,
-    queryStoreAfter time.Duration,
-    storeGatewayQueryStatsEnabled bool,
+    config Config,
     logger log.Logger,
     reg prometheus.Registerer,
 ) (*BlocksStoreQueryable, error) {
@@ -164,16 +164,17 @@ func NewBlocksStoreQueryable(
     }

     q := &BlocksStoreQueryable{
-        stores:                        stores,
-        finder:                        finder,
-        consistency:                   consistency,
-        queryStoreAfter:               queryStoreAfter,
-        logger:                        logger,
-        subservices:                   manager,
-        subservicesWatcher:            services.NewFailureWatcher(),
-        metrics:                       newBlocksStoreQueryableMetrics(reg),
-        limits:                        limits,
-        storeGatewayQueryStatsEnabled: storeGatewayQueryStatsEnabled,
+        stores:                                  stores,
+        finder:                                  finder,
+        consistency:                             consistency,
+        queryStoreAfter:                         config.QueryStoreAfter,
+        logger:                                  logger,
+        subservices:                             manager,
+        subservicesWatcher:                      services.NewFailureWatcher(),
+        metrics:                                 newBlocksStoreQueryableMetrics(reg),
+        limits:                                  limits,
+        storeGatewayQueryStatsEnabled:           config.StoreGatewayQueryStatsEnabled,
+        storeGatewayConsistencyCheckMaxAttempts: config.StoreGatewayConsistencyCheckMaxAttempts,
     }

     q.Service = services.NewBasicService(q.starting, q.running, q.stopping)
@@ -264,7 +265,7 @@ func NewBlocksStoreQueryableFromConfig(querierCfg Config, gatewayCfg storegatewa
         reg,
     )

-    return NewBlocksStoreQueryable(stores, finder, consistency, limits, querierCfg.QueryStoreAfter, querierCfg.StoreGatewayQueryStatsEnabled, logger, reg)
+    return NewBlocksStoreQueryable(stores, finder, consistency, limits, querierCfg, logger, reg)
 }

 func (q *BlocksStoreQueryable) starting(ctx context.Context) error {
@@ -299,16 +300,17 @@ func (q *BlocksStoreQueryable) Querier(mint, maxt int64) (storage.Querier, error
     }

     return &blocksStoreQuerier{
-        minT:                          mint,
-        maxT:                          maxt,
-        finder:                        q.finder,
-        stores:                        q.stores,
-        metrics:                       q.metrics,
-        limits:                        q.limits,
-        consistency:                   q.consistency,
-        logger:                        q.logger,
-        queryStoreAfter:               q.queryStoreAfter,
-        storeGatewayQueryStatsEnabled: q.storeGatewayQueryStatsEnabled,
+        minT:                                    mint,
+        maxT:                                    maxt,
+        finder:                                  q.finder,
+        stores:                                  q.stores,
+        metrics:                                 q.metrics,
+        limits:                                  q.limits,
+        consistency:                             q.consistency,
+        logger:                                  q.logger,
+        queryStoreAfter:                         q.queryStoreAfter,
+        storeGatewayQueryStatsEnabled:           q.storeGatewayQueryStatsEnabled,
+        storeGatewayConsistencyCheckMaxAttempts: q.storeGatewayConsistencyCheckMaxAttempts,
     }, nil
 }

@@ -328,6 +330,9 @@ type blocksStoreQuerier struct {
     // If enabled, query stats of store gateway requests will be logged
     // using `info` level.
     storeGatewayQueryStatsEnabled bool
+
+    // The maximum number of times we attempt fetching missing blocks from different Store Gateways.
+    storeGatewayConsistencyCheckMaxAttempts int
 }

 // Select implements storage.Querier interface.
@@ -534,7 +539,7 @@ func (q *blocksStoreQuerier) queryWithConsistencyCheck(ctx context.Context, logg
         retryableError error
     )

-    for attempt := 1; attempt <= maxFetchSeriesAttempts; attempt++ {
+    for attempt := 1; attempt <= q.storeGatewayConsistencyCheckMaxAttempts; attempt++ {
         // Find the set of store-gateway instances having the blocks. The exclude parameter is the
         // map of blocks queried so far, with the list of store-gateway addresses for each block.
         clients, err := q.stores.GetClientsFor(userID, remainingBlocks, attemptedBlocks, attemptedBlocksZones)
Expand Down
11 changes: 10 additions & 1 deletion pkg/querier/blocks_store_queryable_test.go
@@ -1552,6 +1552,8 @@ func TestBlocksStoreQuerier_Select(t *testing.T) {
         logger: log.NewNopLogger(),
         metrics: newBlocksStoreQueryableMetrics(reg),
         limits: testData.limits,
+
+        storeGatewayConsistencyCheckMaxAttempts: 3,
     }

     matchers := []*labels.Matcher{
@@ -2148,6 +2150,8 @@ func TestBlocksStoreQuerier_Labels(t *testing.T) {
         logger: log.NewNopLogger(),
         metrics: newBlocksStoreQueryableMetrics(reg),
         limits: &blocksStoreLimitsMock{},
+
+        storeGatewayConsistencyCheckMaxAttempts: 3,
     }

     if testFunc == "LabelNames" {
@@ -2371,7 +2375,12 @@ func TestBlocksStoreQuerier_PromQLExecution(t *testing.T) {
     }

     // Instance the querier that will be executed to run the query.
-    queryable, err := NewBlocksStoreQueryable(stores, finder, NewBlocksConsistencyChecker(0, 0, logger, nil), &blocksStoreLimitsMock{}, 0, false, logger, nil)
+    cfg := Config{
+        QueryStoreAfter:                         0,
+        StoreGatewayQueryStatsEnabled:           false,
+        StoreGatewayConsistencyCheckMaxAttempts: 3,
+    }
+    queryable, err := NewBlocksStoreQueryable(stores, finder, NewBlocksConsistencyChecker(0, 0, logger, nil), &blocksStoreLimitsMock{}, cfg, logger, nil)
     require.NoError(t, err)
     require.NoError(t, services.StartAndAwaitRunning(context.Background(), queryable))
     defer services.StopAndAwaitTerminated(context.Background(), queryable) // nolint:errcheck
Expand Down
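Review note: the tests above set `storeGatewayConsistencyCheckMaxAttempts: 3` explicitly, presumably because the Go zero value would make the retry loop run zero times. The sketch below shows that effect with a loop of the same shape as the one in `queryWithConsistencyCheck`. `sketchConfig` and `attemptsRun` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// sketchConfig is a hypothetical, trimmed-down stand-in for the querier Config.
type sketchConfig struct {
	ConsistencyCheckMaxAttempts int
}

// attemptsRun counts how many iterations a loop bounded by the configured
// maximum would execute. It does no real work per attempt.
func attemptsRun(cfg sketchConfig) int {
	n := 0
	for attempt := 1; attempt <= cfg.ConsistencyCheckMaxAttempts; attempt++ {
		n++
	}
	return n
}

func main() {
	// Zero value: the loop body never runs, so no store-gateway would be queried.
	fmt.Println(attemptsRun(sketchConfig{}))
	// Value used by the tests in this PR.
	fmt.Println(attemptsRun(sketchConfig{ConsistencyCheckMaxAttempts: 3}))
}
```

Any code that builds the querier struct or `Config` by hand, rather than through flag registration, therefore needs to set the field itself.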
9 changes: 9 additions & 0 deletions pkg/querier/querier.go
@@ -79,6 +79,9 @@ type Config struct {
     StoreGatewayClient ClientConfig `yaml:"store_gateway_client"`
     StoreGatewayQueryStatsEnabled bool `yaml:"store_gateway_query_stats"`

+    // The maximum number of times we attempt fetching missing blocks from different Store Gateways.
+    StoreGatewayConsistencyCheckMaxAttempts int `yaml:"store_gateway_consistency_check_max_attempts"`
+
     ShuffleShardingIngestersLookbackPeriod time.Duration `yaml:"shuffle_sharding_ingesters_lookback_period"`

     // Experimental. Use https://github.com/thanos-io/promql-engine rather than
@@ -94,6 +97,7 @@ var (
     errShuffleShardingLookbackLessThanQueryStoreAfter = errors.New("the shuffle-sharding lookback period should be greater or equal than the configured 'query store after'")
     errEmptyTimeRange = errors.New("empty time range")
     errUnsupportedResponseCompression = errors.New("unsupported response compression. Supported compression 'gzip' and '' (disable compression)")
+    errInvalidConsistencyCheckAttempts = errors.New("store gateway consistency check max attempts should be greater or equal than 1")
 )

 // RegisterFlags adds the flags required to config this to the given FlagSet.
@@ -122,6 +126,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
     f.StringVar(&cfg.ActiveQueryTrackerDir, "querier.active-query-tracker-dir", "./active-query-tracker", "Active query tracker monitors active queries, and writes them to the file in given directory. If Cortex discovers any queries in this log during startup, it will log them to the log file. Setting to empty value disables active query tracker, which also disables -querier.max-concurrent option.")
     f.StringVar(&cfg.StoreGatewayAddresses, "querier.store-gateway-addresses", "", "Comma separated list of store-gateway addresses in DNS Service Discovery format. This option should be set when using the blocks storage and the store-gateway sharding is disabled (when enabled, the store-gateway instances form a ring and addresses are picked from the ring).")
     f.BoolVar(&cfg.StoreGatewayQueryStatsEnabled, "querier.store-gateway-query-stats-enabled", true, "If enabled, store gateway query stats will be logged using `info` log level.")
+    f.IntVar(&cfg.StoreGatewayConsistencyCheckMaxAttempts, "querier.store-gateway-consistency-check-max-attempts", maxFetchSeriesAttempts, "The maximum number of times we attempt fetching missing blocks from different store-gateways. If no more store-gateways are left (i.e. due to a lower replication factor), then we'll end the retries earlier.")
     f.DurationVar(&cfg.LookbackDelta, "querier.lookback-delta", 5*time.Minute, "Time since the last sample after which a time series is considered stale and ignored by expression evaluations.")
     f.DurationVar(&cfg.ShuffleShardingIngestersLookbackPeriod, "querier.shuffle-sharding-ingesters-lookback-period", 0, "When distributor's sharding strategy is shuffle-sharding and this setting is > 0, queriers fetch in-memory series from the minimum set of required ingesters, selecting only ingesters which may have received series since 'now - lookback period'. The lookback period should be greater or equal than the configured 'query store after' and 'query ingesters within'. If this setting is 0, queriers always query all ingesters (ingesters shuffle sharding on read path is disabled).")
     f.BoolVar(&cfg.ThanosEngine, "querier.thanos-engine", false, "Experimental. Use Thanos promql engine https://github.com/thanos-io/promql-engine rather than the Prometheus promql engine.")
@@ -148,6 +153,10 @@ func (cfg *Config) Validate() error {
         }
     }

+    if cfg.StoreGatewayConsistencyCheckMaxAttempts < 1 {
+        return errInvalidConsistencyCheckAttempts
+    }
+
     return nil
 }

Expand Down