-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: skip rate limiting of snapshot events #1383
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
Size Change: +116 B (+0.01%) Total Size: 1.17 MB
ℹ️ View Unchanged
|
$snapshot_data: snapshotBuffer.data, | ||
$session_id: snapshotBuffer.sessionId, | ||
$window_id: snapshotBuffer.windowId, | ||
const clientRateLimitContext = this.instance.rateLimiter.clientRateLimitContext() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this would get us the global bucket?
What I don't want (for example) is someone has an error in their code and exception autocapture rate limits because they're throwing in a loop
And then we can't capture a session recording.
I think we need to have a separate bucket for $snapshot
events and a different rate limit for this - it's way chattier than other events
Ultimately I think we might want a separate bucket for errors but that can come with time.
I might be misunderstanding what's going on here ofc...
The benefit of client side rate limiting being event type aware would be it can pick the bucket and we don't need to change the recorder.
I'm picturing
public clientRateLimitContext(checkOnly = false, suffix: string): {
isRateLimited: boolean
remainingTokens: number
} {
// This is primarily to prevent runaway loops from flooding capture with millions of events for a single user.
// It's as much for our protection as theirs.
const now = new Date().getTime()
const bucket = this.instance.persistence?.get_property(CAPTURE_RATE_LIMIT + '/' + suffix) ?? {
tokens: this.captureEventsBurstLimit,
last: now,
}
or something equivalent to that so that we have a totally separate bucket.
equally you could make this endpoint aware so that it automatically has a different rate limit per endpoint
the other thing... this is pretty new... maybe the limits are wrong and that's the simplest fix is they should be a little higher so that replay doesn't trigger them
(Sorry brain dumping since i'll be afk for hours now :))
@@ -903,6 +909,7 @@ export class SessionRecording { | |||
_url: this.instance.requestRouter.endpointFor('api', this._endpoint), | |||
_noTruncate: true, | |||
_batchKey: SESSION_RECORDING_BATCH_KEY, | |||
skip_client_rate_limiting: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the short term i think we should just get this line in
i can't think of a problem caused by volume of snapshots that wasn't from a pre-batching version of the SDK
so i think we can skip client rate limiting for replay and then figure out a good solution with the pressure off
@pauldambra (replying to both your comments in one) I think the point about looping exceptions blocking recordings is fair and could be a potential consequence. I'd be wary of changing the rate limits because we've seen with recordings there's always been a bigger fish. If we increase the limits to a "sensible amount" there'll always be something just over that amount. In that sense I don't think the limits could be "wrong", more so the case that they can never be enough. Going to go with the simplest solution of just calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:LGTM:
yep, rrweb is so wary of missing events that we want to step carefully here 🙈
Changes
Adaptation of #1353 based on feedback in #1353 (comment) and https://posthog.slack.com/archives/C03PB072FMJ/p1724078964640479
_captureSnapshot
clientRateLimitContext
is that it tracks one token which we can think of as representing the$snapshot
event_captureSnapshot
we can includeskip_client_rate_limiting
no matter how many individual snapshot events were created during the splitting of the buffer