-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[exporter/loadbalancing] Data loss occurring when change in downstream DNS records or pods #35378
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
We noticed similar issue# 35512 for metrics, we are using this otel lb since our custom exporter is stateful. Do you see any errors in |
@bvsvas our primary signal is the metric We have temporarily solved the issue by falling back to the static resolver, but it means we have to hardcode all the hostnames in: exporters:
loadbalancing:
protocol:
otlp:
sending_queue:
queue_size: 500
resolver:
static:
hostnames:
- pod-0.stateful-sampler.tailsampling:4317
- pod-1.stateful-sampler.tailsampling:4317
- pod-2.stateful-sampler.tailsampling:4317
- .... Would appreciate some thoughts from maintainers, @jpkrohling @MovieStoreGuy any ideas? I would be happy to take a closer look and contribute a fix, just would want some input from codeowners first. |
I'm not overly familiar with the loadbalancer code base but it sounds like the component meant to provide the discovery is failing to provide a refresh or is persisting old exporters. I can take a look some time this week if no one else has the time. |
Okay, I think I have somewhat deduce what the issues but it is unclear on what is the best solution going forward. So in the event that an load balanced exporter is removed from the internal ring, there is a chance that the following scenario has happened: flowchart LR
A[Prior Pipeline] --> B[Load Balance Exporter]
B --> C[Exporter #0]
B --> D[Exporter #1]
C --> E(((Endpoint #0)))
D --> F(((Endpoint #1)))
If flowchart LR
A[Prior Pipeline] --> B[Load Balance Exporter]
B --> C[Exporter #0]
B --> D[Exporter #1]
C --> E(((Endpoint #0)))
D --> F@{ shape: cross-circ }
There is a moment that that is still being routed to Now, this is is the part I am unsure of what is the best path forward since each idea I currently have comes with some amount of cons. Potential solution is that exporters with a Queue expose a method that allows work to be stolen from them effectively. This would mean that queue interface would need some means of being exposed so it could be redistributed in the exporter. Another solution would be for the load balancer exporter to implement its own queue system as a wrapper around the components that effective do the same on above. |
Thanks for the explanation @MovieStoreGuy, this certainly makes sense and it's been our suspicion for a while that this is exactly what's happening. This architecture doesn't allow for auto scaling (or any change to downstream IP addresses) without loss of data in the exporter queue. In my opinion, I think the way forward is to restructure the loadbalancing exporter so that it has it's own queue, and then each exporter should be a consumer of that queue. The"worker" exporters should not have their own queues, as this is where the data is lost. One consumption from the queue would include the routing of a batch and the successful export of all items in the batch, if it fails it stays in the top-level queue. Of course easier said than done, and I am by no means an expert on this code. I would be interested to see @jpkrohling's comment too. |
The understanding of this issue is correct. Originally, the load balancing exporter was developed with the idea of doing an A query to obtain the backends for a specific service name. Once the IP became unavailable, data meant for it would start failing. In ideal scenarios, the IP is first removed from the DNS, the change is picked up by the load balancer, and the internal ring is changed to reassign the trace IDs for that IP to another IP. In cases where IPs simply disappear, this means that we'd have data loss. Data loss was (is?) an acceptable compromise at the beginning for traces, as the assumption is that it would only happen in small quantities and, given enough traffic, similar traces would be available before or after the data loss event. This allowed us to have a load balancer that is horizontally scalable, without need for data sync among instances of the load balancer. Note that the load balancer's main job is to balance the trace IDs among backends in a consistent manner, so that multiple instances of the load balancer can make the same decision without coordination between themselves (which is the consistent hashing part of the solution). So, if an IP is not available anymore, it should be removed from all load balancer instances at about the same time, otherwise spans from the same trace will be routed to different backing collectors. |
@Fiery-Fenix thankyou for your contribution!! |
…e and timeout settings (open-telemetry#36094) #### Description ##### Problem statement `loadbalancing` exporter is actually a wrapper that's creates and manages set of actual `otlp` exporters Those `otlp` exporters technically shares same configuration parameters that are defined on `loadbalancing` exporter level, including `sending_queue` configuration. The only difference is `endpoint` parameter that are substituted by `loadbalancing` exporter itself This means, that `sending_queue`, `retry_on_failure` and `timeout` settings can be defined only on `otlp` sub-exporters, while top-level `loadbalancing` exporter is missing all those settings This configuration approach produces several issue, that are already reported by users: * Impossibility to use Persistent Queue in `loadbalancing` exporter (see open-telemetry#16826). That's happens because `otlp` sub-exporters are sharing the same configurations, including configuration of the queue, i.e. they all are using the same `storage` instance at the same time which is not possible at the moment * Data loss even using `sending_queue` configuration (see open-telemetry#35378). That's happens because Queue is defined on level of `otlp` sub-exporters and if this exporter cannot flush data from queue (for example, endpoint is not available anymore) there is no other options that just to discard data from queue, i.e. there is no higher level queue and persistent storage where data can be returned is case of permanent failure There might be some other potential issue that was already tracked and related to current configuration approach ##### Proposed solution The easiest way to solve issues above - is to use standard approach for queue, retry and timeout configuration using `exporterhelper` This will bring queue, retry and timeout functionality to the top-level of `loadbalancing` exporter, instead of `otlp` sub-exporters Related to mentioned issues it will bring: * Single Persistent Queue, that is used by all `otlp` sub-exporters (not directly of course) * Queue will not be discarded/destroyed if any (or all) of endpoint that are unreachable anymore, top-level queue will keep data until new endpoints will be available * Scale-up and scale-down event for next layer of OpenTelemetry Collectors in K8s environments will be more predictable, and will not include data loss anymore (potential fix for open-telemetry#33959). There is still a big chance of inconsistency when some data will be send to incorrect endpoint, but it's already better state that we have right now ##### Noticeable changes * `loadbalancing` exporter on top-level now uses `exporterhelper` with all supported functionality by it * `sending_queue` will be automatically disabled on `otlp` exporters when it already present on top-level `loadbalancing` exporter. This change is done to prevent data loss on `otlp` exporters because queue there doesn't provide expected result. Also it will prevent potential misconfiguration from user side and as result - irrelevant reported issues * `exporter` attribute for metrics generated from `otlp` sub-exporters now includes endpoint for better visibility and to segregate them from top-level `loadbalancing` exporter - was `"exporter": "loadbalancing"`, now `"exporter": "loadbalancing/127.0.0.1:4317"` * logs, generated by `otlp` sub-exporters now includes additional attribute `endpoint` with endpoint value with the same reasons as for metrics #### Link to tracking issue Fixes open-telemetry#35378 Fixes open-telemetry#16826 #### Testing Proposed changes was heavily tested on large K8s environment with set of different scale-up/scale-down event using persistent queue configuration - no data loss were detected, everything works as expected #### Documentation `README.md` was updated to reflect new configuration parameters available. Sample `config.yaml` was updated as well
…e and timeout settings (open-telemetry#36094) #### Description ##### Problem statement `loadbalancing` exporter is actually a wrapper that's creates and manages set of actual `otlp` exporters Those `otlp` exporters technically shares same configuration parameters that are defined on `loadbalancing` exporter level, including `sending_queue` configuration. The only difference is `endpoint` parameter that are substituted by `loadbalancing` exporter itself This means, that `sending_queue`, `retry_on_failure` and `timeout` settings can be defined only on `otlp` sub-exporters, while top-level `loadbalancing` exporter is missing all those settings This configuration approach produces several issue, that are already reported by users: * Impossibility to use Persistent Queue in `loadbalancing` exporter (see open-telemetry#16826). That's happens because `otlp` sub-exporters are sharing the same configurations, including configuration of the queue, i.e. they all are using the same `storage` instance at the same time which is not possible at the moment * Data loss even using `sending_queue` configuration (see open-telemetry#35378). That's happens because Queue is defined on level of `otlp` sub-exporters and if this exporter cannot flush data from queue (for example, endpoint is not available anymore) there is no other options that just to discard data from queue, i.e. there is no higher level queue and persistent storage where data can be returned is case of permanent failure There might be some other potential issue that was already tracked and related to current configuration approach ##### Proposed solution The easiest way to solve issues above - is to use standard approach for queue, retry and timeout configuration using `exporterhelper` This will bring queue, retry and timeout functionality to the top-level of `loadbalancing` exporter, instead of `otlp` sub-exporters Related to mentioned issues it will bring: * Single Persistent Queue, that is used by all `otlp` sub-exporters (not directly of course) * Queue will not be discarded/destroyed if any (or all) of endpoint that are unreachable anymore, top-level queue will keep data until new endpoints will be available * Scale-up and scale-down event for next layer of OpenTelemetry Collectors in K8s environments will be more predictable, and will not include data loss anymore (potential fix for open-telemetry#33959). There is still a big chance of inconsistency when some data will be send to incorrect endpoint, but it's already better state that we have right now ##### Noticeable changes * `loadbalancing` exporter on top-level now uses `exporterhelper` with all supported functionality by it * `sending_queue` will be automatically disabled on `otlp` exporters when it already present on top-level `loadbalancing` exporter. This change is done to prevent data loss on `otlp` exporters because queue there doesn't provide expected result. Also it will prevent potential misconfiguration from user side and as result - irrelevant reported issues * `exporter` attribute for metrics generated from `otlp` sub-exporters now includes endpoint for better visibility and to segregate them from top-level `loadbalancing` exporter - was `"exporter": "loadbalancing"`, now `"exporter": "loadbalancing/127.0.0.1:4317"` * logs, generated by `otlp` sub-exporters now includes additional attribute `endpoint` with endpoint value with the same reasons as for metrics #### Link to tracking issue Fixes open-telemetry#35378 Fixes open-telemetry#16826 #### Testing Proposed changes was heavily tested on large K8s environment with set of different scale-up/scale-down event using persistent queue configuration - no data loss were detected, everything works as expected #### Documentation `README.md` was updated to reflect new configuration parameters available. Sample `config.yaml` was updated as well
@jpkrohling @Fiery-Fenix @MovieStoreGuy I commented this on the PR but since it's closed I'll re-iterate. I don't think this is actually fixed.
The PR unfortunately needs more work for it to avoid data loss for us. Confirmed our production load cannot handle this, even with 500 consumers on the top level queue. With sub-level queue, data loss occurs on scaling. Without it, performance problems occur which cause data loss anyway. |
Will create a new issue for this |
…e and timeout settings (open-telemetry#36094) #### Description ##### Problem statement `loadbalancing` exporter is actually a wrapper that's creates and manages set of actual `otlp` exporters Those `otlp` exporters technically shares same configuration parameters that are defined on `loadbalancing` exporter level, including `sending_queue` configuration. The only difference is `endpoint` parameter that are substituted by `loadbalancing` exporter itself This means, that `sending_queue`, `retry_on_failure` and `timeout` settings can be defined only on `otlp` sub-exporters, while top-level `loadbalancing` exporter is missing all those settings This configuration approach produces several issue, that are already reported by users: * Impossibility to use Persistent Queue in `loadbalancing` exporter (see open-telemetry#16826). That's happens because `otlp` sub-exporters are sharing the same configurations, including configuration of the queue, i.e. they all are using the same `storage` instance at the same time which is not possible at the moment * Data loss even using `sending_queue` configuration (see open-telemetry#35378). That's happens because Queue is defined on level of `otlp` sub-exporters and if this exporter cannot flush data from queue (for example, endpoint is not available anymore) there is no other options that just to discard data from queue, i.e. there is no higher level queue and persistent storage where data can be returned is case of permanent failure There might be some other potential issue that was already tracked and related to current configuration approach ##### Proposed solution The easiest way to solve issues above - is to use standard approach for queue, retry and timeout configuration using `exporterhelper` This will bring queue, retry and timeout functionality to the top-level of `loadbalancing` exporter, instead of `otlp` sub-exporters Related to mentioned issues it will bring: * Single Persistent Queue, that is used by all `otlp` sub-exporters (not directly of course) * Queue will not be discarded/destroyed if any (or all) of endpoint that are unreachable anymore, top-level queue will keep data until new endpoints will be available * Scale-up and scale-down event for next layer of OpenTelemetry Collectors in K8s environments will be more predictable, and will not include data loss anymore (potential fix for open-telemetry#33959). There is still a big chance of inconsistency when some data will be send to incorrect endpoint, but it's already better state that we have right now ##### Noticeable changes * `loadbalancing` exporter on top-level now uses `exporterhelper` with all supported functionality by it * `sending_queue` will be automatically disabled on `otlp` exporters when it already present on top-level `loadbalancing` exporter. This change is done to prevent data loss on `otlp` exporters because queue there doesn't provide expected result. Also it will prevent potential misconfiguration from user side and as result - irrelevant reported issues * `exporter` attribute for metrics generated from `otlp` sub-exporters now includes endpoint for better visibility and to segregate them from top-level `loadbalancing` exporter - was `"exporter": "loadbalancing"`, now `"exporter": "loadbalancing/127.0.0.1:4317"` * logs, generated by `otlp` sub-exporters now includes additional attribute `endpoint` with endpoint value with the same reasons as for metrics #### Link to tracking issue Fixes open-telemetry#35378 Fixes open-telemetry#16826 #### Testing Proposed changes was heavily tested on large K8s environment with set of different scale-up/scale-down event using persistent queue configuration - no data loss were detected, everything works as expected #### Documentation `README.md` was updated to reflect new configuration parameters available. Sample `config.yaml` was updated as well
Component(s)
exporter/loadbalancing
What happened?
Description
We have used the DNS resolver and K8S resolver which seem to both cause data loss, because they route based on the IP address.
We are using the load balancing exporter for tail sampling, but this problem is generally applicable to this exporter.
When our downstream services changes scales/deploys, the IP addresses change. And, since the IP address is tied to exporter, the spans get trapped in that exporter's queue, with no way out. That exporter is deleted along with the data that lives in its queue.
Everytime we deploy we get a bunch of dropped spans as a result.
Steps to Reproduce
Set up the loadbalancing exporter with the DNS resolver, then change the IP address that the DNS record resolves to. You will see
otelcol_exporter_send_failed_spans
tick up.Expected Result
Data is not dropped but re-balanced / successfully exported.
Actual Result
Data is lost along with the exporter that is deleted due to its IP address no longer being active.
Potential solutions
We are using a stateful set, so the DNS records / pod names are stable, i.e.
pod-0, pod-1, ... , pod-n
. If the exporters were attached to these names instead of the IP addresses, this would solve the issue. This may be possible using the pod hostname or SRV record.Another potential solution, which may or may not be possible, is to release the data trapped in the exporter queue, when an exporter is deleted.
Collector version
v0.109.0
Environment information
Environment
Kubernetes, arm64
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: