-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add prometheus.relabel external cache proposal #1600
base: main
Are you sure you want to change the base?
Add prometheus.relabel external cache proposal #1600
Conversation
92c70da
to
9f093a1
Compare
forward_to = [...] | ||
cache { | ||
backend = "redis" | ||
redis { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of a bespoke redis/memcached action, how about a redis and memcached component that exports a key value interface? Along with generic accessors. I could see this being used for more than just relabel. Much in the same way we allow components to consume strings but those strings can come from vault, file, http etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
storage.redis "relabel" {
url = "example.com"
}
prometheus.relabel "relabel" {
cache = storage.redis.relabel.cache
}
// Just for fun
prometheus.remote_write "mimir" {
endpoint {
external_labels = {
cluster = coalesce(storage.redis.relabel["cluster"],"unknown")
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could even get clever and allow prometheus.relabel to have a list of caches, if there are problems with redis then fallback to in memory.
prometheus.relabel "relabel" {
caches = [storage.redis.relabel.cache,storage.memory.relabel]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitively like the idea of having a unified way to define the caching mecanism.
if there are problems with redis then fallback to in memory.
You are correct but this add an extra complexity to take into account. A highly available external cache is indeed recommended but having a fallback is definitively a good idea.
The extra complexity comes from when the external caches becomes unavailable Alloy needs to use the in-memory one. This could lead to OOM, e.g: you have plan to use an external cache then you fallback on in-memory but the limits of your pod is not high enough.
Then Alloy needs to check if the external storage is available again. Once it is, Alloy needs to move its in-memory k/v to the external cache. But we are exposed to a race when we have two Alloy instances using the same cache.
That being said, as I'm not that familiar with the code base yet and your proposal seems harder to achieve.
What do you think about extending our proposal to loki.relabel
and standardize the whole thing as prometheus and loki are the main component using cache?
This would serve as a first step toward your proposal. Something like so
type RelabelCache interface {
Get(key uint64) (*labelAndID, error)
Set(key uint64, value *labelAndID) error
Remove(key uint64)
Clear(newSize int) error
GetCacheSize() int
}
Once done, the associated work to create the storage
components such as storage.redis
could be picked up right after the initial proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fallback caches are likely a stretch though I do feel the kv store should be a generic component. As far as the interface I would go with something like the below. Disclaimer this is getting into the deeper technical ideas. The reason keys might want to be strings is to mirror how alloy uses maps. Where the key is always a string.
type Cache interface {
Get(key string) (any, bool)
Set(key string, value any, ttl int)
Remove(key uint64)
Clear(newSize int) error
GetCacheSize() int
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't have to specifically use that for the first round. Capsules are mainly if you want to bridge into it being accessible within the alloy config. In this case your likely passing an interface/pointer from go code to go code, which should work without problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the proposal to take reflect what we have been discussing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think we will likely need GetMultiple(keys []string) ([]Results)
and SetMultiple
, imagine a scenario where you are scraping a million records, hitting redis each time is going to be pretty expensive at one request for each.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattdurham I've tried to take your feedbacks into account, the PR should be fine what do you think about it?
We have started to work on the code with @pbailhache, what would be the next step regarding this proposal? should we merge it if it is to your liking and iterate over the code?
p.s: I don't know what to do about the build 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just opened a draft pull request for the feature :
#1692
This is still a work in progress and I'm ready to iterate over it.
b1f8f74
to
8d68d6b
Compare
|
||
## Compatibility | ||
|
||
This proposal offer to deprecate for a couple of release the old way to configure the in-memory cache, then release+2 drop the old way. Doing so allow to migrate the settings to the correct block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think our accepted backwards compatibility guarantees go against this. Any breaking changes for stable functionality would have to wait until v2.0.
Is there any reason for wanting to deprecate it? We can't control the pace of users adopting new versions, so IMO doing our best to avoid breaking their configurations and use cases would be the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to deprecate anything sooner than what is guaranteed.
I will rephrase to say that we keep both compatibility until the next major release
This PR has not had any activity in the past 30 days, so the |
This is not inactive as we are iterating on an implementation in a distinct PR |
Signed-off-by: Wilfried Roset <[email protected]>
Signed-off-by: Wilfried Roset <[email protected]>
Signed-off-by: Wilfried Roset <[email protected]>
Signed-off-by: Wilfried Roset <[email protected]>
8684ddd
to
2a0a8a2
Compare
PR Description
This PR add a proposal to introduce the support for an external cache for
prometheus.relabel
component.Which issue(s) this PR fixes
N/A
Notes to the Reviewers