[receiver/k8seventsreceiver] support kubernetes leader election #17369
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping
Hi @newly12, thanks for reporting the issue. The k8sevents receiver will likely be deprecated in favor of the k8sobjects receiver. The leader election mechanism is a pretty complicated addition to the otel collector. The k8s receivers fetching data from the control plane (k8scluster and k8sobjects) are expected to be deployed as a one-replica deployment. Can you please elaborate on why that won't work for federated clusters?
Pinging code owners for receiver/k8sevents: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hi @dmitryax, it works for federated clusters as well. I guess the question for me is whether we should, or whether it makes sense to, support HA (or an active-standby mode) in the otel collector.

In a regular cluster, the collector is usually deployed as a one-replica deployment running in in-cluster mode, and there will be a few seconds of data loss during any rollout, e.g. an image upgrade. In federated cluster cases, the collector can currently only be deployed in one sub-cluster, with the federated cluster API server address, service account token, etc. configured either via configuration or env vars; in case of a sub-cluster outage (a network issue or something else), there will be data loss.

I think leader election makes sense for both cases. In the first case, the collector can be a two-replica deployment, and when the active pod is shut down for an upgrade, or evicted, the standby pod can start receiving events immediately to reduce data loss as much as possible. In the second case, HA support matters even more, to avoid a single-cluster failure.

One available solution is the k8s controller-runtime framework, which leverages a lease object (or a configmap, IIRC) for leader election. I was thinking a similar approach could be used, or the framework itself could be leveraged. I haven't looked into the k8sevents or k8sobjects receiver code yet, so I can't tell how many changes would be needed.
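A minimal sketch of the approach described above, assuming client-go's `leaderelection` package with a Lease lock (the same primitive controller-runtime uses under the hood). The lease name/namespace and the `startWatchingEvents` hook are illustrative placeholders, not actual receiver code:

```go
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// In-cluster config for the regular case; a kubeconfig pointing at the
	// federated API server would work the same way.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Each replica identifies itself by pod name (the hostname inside the pod).
	identity, _ := os.Hostname()

	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "otelcol-k8sevents", Namespace: "default"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: identity},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		ReleaseOnCancel: true,
		LeaseDuration:   15 * time.Second,
		RenewDeadline:   10 * time.Second,
		RetryPeriod:     2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// Only the elected replica consumes events; standbys wait for the lease.
				startWatchingEvents(ctx, client)
			},
			OnStoppedLeading: func() {
				// Lease lost: stop consuming so the new leader can take over.
			},
		},
	})
}

// startWatchingEvents stands in for the receiver's existing event watch loop.
func startWatchingEvents(ctx context.Context, client kubernetes.Interface) {
	<-ctx.Done()
}
```

With two replicas running this logic against the same lease, only one consumes events at a time, and the standby picks up within roughly the lease duration when the active pod is evicted or upgraded.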
Sounds good to me. I agree that having an HA option is great. We need this functionality to be reusable across all the receivers fetching data from the k8s cluster API. Currently, those are the k8scluster, k8sevents, and k8sobjects receivers. Let me know if you have a chance to work on that.
Thanks for the confirmation. I may not have the bandwidth to work on this in the near term; how long can the issue remain open?
Up to half a year. I can also add a label |
Sounds good, could you please add the tag? Thanks.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@dmitryax I recently planned to replace our original kube-event-exporter component with |
@JaredTan95 @newly12 @dmitryax Any plans to implement HA for the k8sclusterreceiver? Another use case I can think of: we are running these workloads as DaemonSets and seeing a lot of duplicate data being gathered. Also, could you please suggest any workaround until then? Thanks in advance.
I think that after this PR gets merged, we will be able to move forward very quickly with other receivers supporting leader election 😄
@JaredTan95 Any idea when this PR will be merged and available for users?
With #24242 still very much a goal for the future of the
The leader election mechanism should be reusable across any k8s components (and maybe even beyond k8s). There is another issue for supporting this in the k8s cluster receiver, which is more appropriate: #24678. We need to keep at least one issue open regarding this. If anyone is interested, please feel free to help.
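A rough, hypothetical sketch of what such a reusable piece could look like, assuming it is exposed to receivers as a small callback-registration interface; the names below are illustrative, not an existing collector API:

```go
package sketch

import "context"

// LeaderElector is a hypothetical shared component (for example, a collector
// extension) that runs a single election per collector instance. Receivers
// such as k8scluster, k8sevents, or k8sobjects would register callbacks
// instead of each implementing their own election loop.
type LeaderElector interface {
	SetCallBackFuncs(onStartedLeading func(ctx context.Context), onStoppedLeading func())
}

// Example of how a receiver might hook into it from its Start method.
type eventsReceiver struct{ elector LeaderElector }

func (r *eventsReceiver) Start(ctx context.Context) error {
	r.elector.SetCallBackFuncs(
		func(ctx context.Context) {
			// Became leader: begin watching and forwarding k8s events.
		},
		func() {
			// Lost the lease: stop watching so the new leader takes over.
		},
	)
	return nil
}
```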
I'd prefer we track the feature via the k8sobjectsreceiver instead of the k8seventsreceiver. I've opened #32994.
+1. So it would be more appropriate for this issue to be closed in order to focus attention on
FYI: a more generic solution was proposed in #34460
Component(s)
receiver/k8seventsreceiver
Is your feature request related to a problem? Please describe.
The use case is that we need to collect federated k8s cluster events, but otel collector pods are deployed in sub-cluster(s). The idea is to support leader election so that pods can be deployed in multiple sub-clusters and run in active-standby mode.
Describe the solution you'd like
Support Kubernetes leader election
Describe alternatives you've considered
n/a
Additional context
No response