Series limits in queries are difficult to understand #6373

Open
damnever opened this issue Nov 26, 2024 · 4 comments

Comments

@damnever
Contributor

Describe the bug

Users cannot tell whether vertical sharding applies to a given query. Because the series limit is counted independently for each shard, a sharded query is less likely to hit the limit than a non-sharded one. In addition, the ruler does not support vertical sharding, unlike the query-frontend.

To Reproduce

When vertical sharding is enabled, the following query might work fine:

sum by (cluster) (rate(api_total[5m]))

However, these queries fail with the error 'the query hit the max number of series limit':

rate(api_total[5m])
# or sum(rate(api_total[5m]))
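The difference between the two cases can be illustrated with a toy model. This is only a sketch, not Cortex code: it assumes that a sharded query is split by hashing the grouping labels, that a query without grouping labels runs as a single shard, and that the limit is checked per shard. The names `shard_index`, `per_shard_counts`, and `hits_limit` are hypothetical, and the byte-sum "hash" is a stand-in chosen so the example is easy to follow.

```python
def shard_index(labels, by, num_shards):
    # Toy stand-in for a real label hash (assumption): sum the bytes of the
    # grouping label values and take the remainder.
    key = "".join(labels[name] for name in by)
    return sum(key.encode()) % num_shards


def per_shard_counts(series, by, num_shards):
    # Count how many series land in each shard.
    counts = [0] * num_shards
    for labels in series:
        counts[shard_index(labels, by, num_shards)] += 1
    return counts


def hits_limit(series, limit, by=(), num_shards=1):
    # Assumption: without grouping labels the query is not sharded at all.
    if not by:
        num_shards = 1
    # The limit is compared with each shard's count, never with the total.
    return any(count > limit for count in per_shard_counts(series, by, num_shards))


# 4 clusters x 3 instances = 12 series in total.
series = [
    {"cluster": f"c{c}", "instance": f"i{i}"}
    for c in range(4)
    for i in range(3)
]

sharded_hits = hits_limit(series, limit=5, by=("cluster",), num_shards=4)
unsharded_hits = hits_limit(series, limit=5)
```

In this model the same 12 series stay under a limit of 5 when split into four shards by `cluster`, but exceed it when the query runs unsharded, which matches the behavior described above.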

Expected behavior

Series limits should be easy to understand, with a clear explanation of the level at which series are counted and limited.

Environment:

  • Infrastructure: [e.g., Kubernetes, bare-metal, laptop]
  • Deployment tool: [e.g., helm, jsonnet]

Additional Context

@yeya24
Contributor

yeya24 commented Dec 6, 2024

Hi @damnever, thanks for the issue.

I agree it is something to improve. We have thought about having a global limit that counts series across shards instead of per shard, but that is not a trivial change.

Do you have any suggestions?
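As a rough illustration of what counting across shards could look like, here is a minimal sketch of a limiter shared by all shards of one query. It is not how Cortex implements limits: `GlobalSeriesLimiter` and `SeriesLimitError` are hypothetical names, and the sketch ignores the hard part mentioned above (shards running in different processes).

```python
import threading


class SeriesLimitError(Exception):
    """Raised when the combined series count of all shards exceeds the limit."""


class GlobalSeriesLimiter:
    """Hypothetical limiter shared by every shard of a single query."""

    def __init__(self, limit):
        self._limit = limit
        self._count = 0
        self._lock = threading.Lock()

    def add(self, num_series):
        # Each shard reports the series it fetched; the running total is
        # compared with the limit, so splitting a query does not raise
        # the effective limit.
        with self._lock:
            self._count += num_series
            if self._count > self._limit:
                raise SeriesLimitError(
                    f"query fetched {self._count} series, limit is {self._limit}"
                )
```

With a limit of 5, two shards that each call `add(3)` would trip the limit on the second call, whereas a per-shard check would let both through.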

@damnever
Contributor Author

damnever commented Dec 6, 2024

Personally, I would prefer to treat the existing limit as the global limit; however, this might break compatibility.

@damnever
Contributor Author

damnever commented Dec 20, 2024

@yeya24 another issue is that the series limit in the store-gateway is a simple sum of the block-level counts (Thanos code), not the unique series count, even though each block may contain a portion of the same series.

thanos-io/thanos#8011

@damnever
Contributor Author

damnever commented Dec 20, 2024

@yeya24 another issue is that the series limit in the store-gateway is a simple sum of the block-level counts (Thanos code), not the unique series count, even though each block may contain a portion of the same series.

thanos-io/thanos#8011

It might hurt performance, but one solution could be to check the limit against each block independently, without summing the counts, and then let the querier count the unique series.
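To make the difference concrete, here is a small sketch comparing the two ways of counting. It is only a model of the idea, not the Thanos or Cortex code: each block is represented as a set of series identifiers, and the function names are hypothetical.

```python
def summed_block_count(blocks):
    # Current behavior as described above: add up the per-block counts,
    # so a series present in several blocks is counted several times.
    return sum(len(block) for block in blocks)


def unique_series_count(blocks):
    # What a user would likely expect: each series is counted once.
    return len(set().union(*blocks))


def exceeds_limit_summed(blocks, limit):
    return summed_block_count(blocks) > limit


def exceeds_limit_proposed(blocks, limit):
    # Proposed alternative: compare each block with the limit on its own,
    # then count unique series after merging (the querier's job in this model).
    if any(len(block) > limit for block in blocks):
        return True
    return unique_series_count(blocks) > limit


# Three blocks that mostly hold the same series.
blocks = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"b", "c", "d"},
]
```

Here the summed count is 9 while only 4 unique series exist, so with a limit of 5 the summed check rejects the query and the proposed check accepts it.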


2 participants