There is definitely a problem here. Last I checked, the retention policy is executed on a strict cron-like schedule. If many indexes share the same schedule frequency, their policies all run at once (technically one after the other in quick succession, as fast as possible). Based on the airmail logs, we currently seem to run roughly 20k retention policies all at once.
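One possible way to break up the burst is to give each index a stable offset inside its schedule period, so that indexes sharing the same frequency no longer fire on the same tick. The snippet below is only a sketch of that idea in Python, not the janitor's actual code: `jitter_offset_secs` and `next_run_at` are hypothetical names, and it assumes the schedule can be reduced to a fixed period in seconds.

```python
import hashlib


def jitter_offset_secs(index_id: str, period_secs: int) -> int:
    """Map an index ID to a stable offset in [0, period_secs).

    Hypothetical helper: hashing the ID keeps the offset identical across
    restarts, so an index always runs at the same point within its period.
    """
    digest = hashlib.sha256(index_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % period_secs


def next_run_at(index_id: str, now_secs: int, period_secs: int) -> int:
    """Return the next execution time (in seconds) strictly after now_secs."""
    offset = jitter_offset_secs(index_id, period_secs)
    # Start of the period that contains `now_secs`.
    period_start = now_secs - (now_secs % period_secs)
    candidate = period_start + offset
    if candidate <= now_secs:
        # This period's slot has already passed: use the next period's slot.
        candidate += period_secs
    return candidate
```

With an hourly period, 20k indexes would spread over 3600 one-second slots instead of landing on a single one. Whether a hash-based offset is acceptable depends on how strictly users expect the cron expression to be honored.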
We also seem to execute all GC calls at once, but scoped by index, which causes many consecutive calls, and this happens much more often (every 10 minutes or so). That is something we can also improve.
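As a rough illustration of how the per-index GC calls could be reduced, the sketch below groups index IDs into batches and pauses between batches. It is written in Python for brevity and rests on assumptions: `gc_batch` is a hypothetical callback standing in for whatever issues the GC request, and it presumes that a single call can cover several indexes, which would need to be checked against the real metastore API.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` containing at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_gc_in_batches(
    index_ids: Sequence[str],
    gc_batch: Callable[[List[str]], None],  # hypothetical: one GC call for many indexes
    batch_size: int = 100,
    pause_secs: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run GC over all indexes in batches and return the number of calls made."""
    num_calls = 0
    for batch in chunked(index_ids, batch_size):
        if num_calls > 0:
            # Leave the metastore some breathing room between batches.
            sleep(pause_secs)
        gc_batch(batch)
        num_calls += 1
    return num_calls
```

The `batch_size` and `pause_secs` defaults are placeholders, not tuned values.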
In #5346 we spotted that our implementation of delete index was too aggressive.
For airmail, their internal job deleting a large number of indexes ended up hammering the metastore, hence disrupting indexing.
We want to make sure that we don't have a similar pattern in the janitor, in particular when running the retention policy.