feat: changes rate limiter #549
base: main
9e039f5 to a7358dd
1309376 to fe21c62
db/update.go
Outdated
@@ -276,7 +276,11 @@ func extractChanges(ctx api.ScrapeContext, result *v1.ScrapeResult, ci *models.C
    if changeResult.UpdateExisting {
        updates = append(updates, change)
    } else {
        newOnes = append(newOnes, change)
        if ok, err := ctx.TempCache().IsChangePersisted(change.ConfigID, change.ExternalChangeId); err != nil {
This is going to be extremely slow. If we keep the ON CONFLICT DO NOTHING and add a RETURNING config_id, we can just increment the rate limit counter after insertion.
But the rate limit needs to happen before insertion.
We must know which changes shouldn't count towards the rate limit.
Maybe a single query that filters out the existing changes would do the job.
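A rough sketch of what such a single lookup could look like, assuming changes are identified by the (config_id, external_change_id) pair. The changeKey type, the buildPersistedLookup helper and the external_change_id column name are hypothetical and only illustrate the idea; the ? placeholders assume a driver or ORM that rewrites them, and type casts may be needed in practice.

```go
package main

import "strings"

// changeKey is a hypothetical identifier for a change. It mirrors
// change.ConfigID and change.ExternalChangeId from the diff.
type changeKey struct {
	ConfigID         string
	ExternalChangeID string
}

// buildPersistedLookup returns one SQL statement, plus its arguments, that
// selects every candidate pair already present in config_changes.
// The external_change_id column name is an assumption.
func buildPersistedLookup(keys []changeKey) (string, []any) {
	if len(keys) == 0 {
		return "", nil // nothing to look up
	}
	placeholders := make([]string, 0, len(keys))
	args := make([]any, 0, 2*len(keys))
	for _, k := range keys {
		placeholders = append(placeholders, "(?, ?)")
		args = append(args, k.ConfigID, k.ExternalChangeID)
	}
	query := "SELECT config_id, external_change_id FROM config_changes " +
		"WHERE (config_id, external_change_id) IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}
```

The rows returned by this query are the changes that already exist, so the caller can drop them from the batch before the rate limiter sees anything.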
defccec to 879926a
db/changes.go
Outdated
query := `WITH latest_changes AS (
    SELECT
        DISTINCT ON (config_id) config_id,
        change_type
The rate limit should be by config id, not by config id and type. We can just do: select config_id, count(*) from config_changes where scraper_id = ? and created_at > now() - window group by config_id
The rate limit is on config_id only.
This query finds all the configs of the given scraper whose latest change was "TooManyChanges".
db/changes.go
Outdated
// filterOutPersistedChanges returns only those changes that weren't seen in the db.
func filterOutPersistedChanges(ctx api.ScrapeContext, changes []*models.ConfigChange) ([]*models.ConfigChange, error) {
    // use cache to filter out ones that we've already seen before
    changes = lo.Filter(changes, func(c *models.ConfigChange, _ int) bool {
This cache would be too big. Why can't we just insert with ON CONFLICT DO NOTHING RETURNING config_id?
Wouldn't that return the config_id of the changes that were actually inserted?
If the max change count is 10 per window and we insert a batch of 500 changes first, the rate limit would be disregarded entirely.
The config_change cache isn't actually caching the changes, though. Its value is an empty struct, whose size is 0.
879926a to 70b0532
We don't want to create new config changes with an ON CONFLICT DO NOTHING clause, because the same config change should not take up multiple quotas from the rate limiter. For example, if an AWS scraper runs @every 5m, we'd try to insert the same config changes generated by the CloudTrail scraper again and again on every run, and the same change would take up one more quota from the rate limiter each time. By knowing that a change already exists, we can avoid inserting it in the first place, so it never counts against the rate limiter.
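The ordering argument can be illustrated with a toy model: filter out changes that are already known, and only then charge the rate limiter. Everything below (changeKey, quota, admit) is hypothetical and much simpler than the real scraper code. In particular, the quota here has no time window, and rate-limited changes are simply dropped.

```go
package main

// changeKey is a hypothetical identifier for a change.
type changeKey struct {
	ConfigID         string
	ExternalChangeID string
}

// quota is a toy per-config counter standing in for the real rate limiter.
type quota struct {
	limit int
	used  map[string]int
}

func newQuota(limit int) *quota {
	return &quota{limit: limit, used: make(map[string]int)}
}

// allow consumes one unit for configID and reports whether it fit.
func (q *quota) allow(configID string) bool {
	if q.used[configID] >= q.limit {
		return false
	}
	q.used[configID]++
	return true
}

// admit skips changes already recorded in seen, then charges the quota only
// for the remaining ones. The set stores struct{} values, so each entry
// costs only its key.
func admit(seen map[changeKey]struct{}, q *quota, batch []changeKey) []changeKey {
	var admitted []changeKey
	for _, k := range batch {
		if _, ok := seen[k]; ok {
			continue // already persisted: consumes no quota
		}
		if !q.allow(k.ConfigID) {
			continue // over the limit for this config
		}
		seen[k] = struct{}{}
		admitted = append(admitted, k)
	}
	return admitted
}
```

With a limit of 2, a first run over three new changes for the same config admits two of them. A second run that re-submits those two admits nothing and leaves the counter unchanged, which is the behaviour the paragraph above asks for.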
1957d50 to 284b39f
284b39f to 5f43ea7
resolves: #530