Failover connector is not working with otlp grpc and otlp http but is working with syslog #34582
Comments
Hi @j-blomart, I think the issue is that you have the sending_queue enabled. The otlp and otlphttp exporters aren't going to return an error until the sending_queue is full: as long as there is space on the queue to place the batch, that is viewed as a successful export (the failover connector uses backwards-propagated errors to trigger failover). Try repeating your test with the sending_queue disabled. For context, there was a feature request for adding a sending_queue directly to the failover connector instead: #33007. I'll have a PR for this functionality shortly.
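To make the point above concrete, here is a minimal Go sketch of why an enabled queue hides a dead endpoint from the failover logic. It is not the collector's actual code: `exportFunc`, `withQueue`, `failover`, `downExporter` and `upExporter` are hypothetical names made up for illustration, and the real connector also handles retry intervals, which are left out here.

```go
package main

import (
	"errors"
	"fmt"
)

// exportFunc is a hypothetical stand-in for an exporter's consume path.
type exportFunc func(batch string) error

// downExporter simulates an unreachable endpoint (e.g. a killed receiver).
func downExporter(batch string) error {
	return errors.New("connection refused")
}

// upExporter simulates a healthy endpoint.
func upExporter(batch string) error {
	return nil
}

// withQueue mimics an enabled sending_queue: the caller only sees an error
// when the queue is full. The real send to next would happen later, in the
// background, so its failure never reaches the caller.
func withQueue(next exportFunc, capacity int) exportFunc {
	_ = next // asynchronous draining to next is omitted in this sketch
	queue := make([]string, 0, capacity)
	return func(batch string) error {
		if len(queue) >= capacity {
			return errors.New("sending queue is full")
		}
		queue = append(queue, batch)
		return nil
	}
}

// failover tries each priority level in order and returns the index of the
// first level that accepted the batch without returning an error.
func failover(levels []exportFunc, batch string) (int, error) {
	var lastErr error
	for i, export := range levels {
		if err := export(batch); err != nil {
			lastErr = err
			continue
		}
		return i, nil
	}
	return -1, lastErr
}

func main() {
	// Queue disabled: the error from level 0 propagates back, so level 1 is used.
	level, _ := failover([]exportFunc{downExporter, upExporter}, "test")
	fmt.Println("queue disabled, delivered via level", level)

	// Queue enabled: level 0 accepts the batch into its queue, so no failover.
	level, _ = failover([]exportFunc{withQueue(downExporter, 10), upExporter}, "test")
	fmt.Println("queue enabled, accepted by level", level)
}
```

With the queue disabled the sketch falls through to the second level; with the queue enabled the first level reports success even though its endpoint is down, which matches the behaviour described in the issue.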
@akats7 Changed otlp producer config to:

receivers:
  namedpipe:
    path: ./logs
exporters:
  otlp/a:
    endpoint: localhost:4317
    retry_on_failure:
      enabled: false
    tls:
      insecure: true
    sending_queue:
      enabled: false
  otlp/b:
    endpoint: localhost:4318
    retry_on_failure:
      enabled: false
    tls:
      insecure: true
    sending_queue:
      enabled: false
connectors:
  failover:
    retry_interval: 10s
    retry_gap: 3s
    priority_levels:
      - [logs/a]
      - [logs/b]
service:
  telemetry:
    metrics:
      level: none
  pipelines:
    logs:
      receivers: [namedpipe]
      exporters: [failover]
    logs/a:
      receivers: [failover]
      exporters: [otlp/a]
    logs/b:
      receivers: [failover]
      exporters: [otlp/b]

The logs show export failure and failover to otlp/b exporter:

$ ../otelcol --config otelcol_producer.yaml &
$ 2024-08-09T19:51:15.555Z info [email protected]/service.go:116 Setting up own telemetry...
2024-08-09T19:51:15.555Z info [email protected]/service.go:119 OpenCensus bridge is disabled for Collector telemetry and will be removed in a future version, use --feature-gates=-service.disableOpenCensusBridge to re-enable
2024-08-09T19:51:15.555Z info [email protected]/service.go:172 Skipped telemetry setup. {"address": ":8888", "metrics level": "None"}
2024-08-09T19:51:15.555Z info [email protected]/service.go:198 Starting otelcol-csw... {"Version": "1.0.0", "NumCPU": 12}
2024-08-09T19:51:15.555Z info extensions/extensions.go:34 Starting extensions...
2024-08-09T19:51:15.556Z info adapter/receiver.go:46 Starting stanza receiver {"kind": "receiver", "name": "namedpipe", "data_type": "logs"}
2024-08-09T19:51:15.556Z info [email protected]/service.go:224 Everything is ready. Begin running and processing data.
2024-08-09T19:51:15.556Z info localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
$ ../otelcol --config otelcol_receiver_a.yaml &
$ 2024-08-09T19:51:23.868Z info [email protected]/service.go:116 Setting up own telemetry...
2024-08-09T19:51:23.869Z info [email protected]/service.go:119 OpenCensus bridge is disabled for Collector telemetry and will be removed in a future version, use --feature-gates=-service.disableOpenCensusBridge to re-enable
2024-08-09T19:51:23.869Z info [email protected]/service.go:172 Skipped telemetry setup. {"address": ":8888", "metrics level": "None"}
2024-08-09T19:51:23.869Z info [email protected]/exporter.go:280 Development component. May change in the future. {"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:51:23.869Z info [email protected]/service.go:198 Starting otelcol-csw... {"Version": "1.0.0", "NumCPU": 12}
2024-08-09T19:51:23.869Z info extensions/extensions.go:34 Starting extensions...
2024-08-09T19:51:23.869Z info [email protected]/otlp.go:102 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "localhost:4317"}
2024-08-09T19:51:23.870Z info [email protected]/service.go:224 Everything is ready. Begin running and processing data.
2024-08-09T19:51:23.870Z info localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
$ ../otelcol --config otelcol_receiver_b.yaml &
$ 2024-08-09T19:51:26.895Z info [email protected]/service.go:116 Setting up own telemetry...
2024-08-09T19:51:26.895Z info [email protected]/service.go:119 OpenCensus bridge is disabled for Collector telemetry and will be removed in a future version, use --feature-gates=-service.disableOpenCensusBridge to re-enable
2024-08-09T19:51:26.895Z info [email protected]/service.go:172 Skipped telemetry setup. {"address": ":8888", "metrics level": "None"}
2024-08-09T19:51:26.895Z info [email protected]/exporter.go:280 Development component. May change in the future. {"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:51:26.895Z info [email protected]/service.go:198 Starting otelcol-csw... {"Version": "1.0.0", "NumCPU": 12}
2024-08-09T19:51:26.895Z info extensions/extensions.go:34 Starting extensions...
2024-08-09T19:51:26.895Z info [email protected]/otlp.go:102 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "localhost:4318"}
2024-08-09T19:51:26.896Z info [email protected]/service.go:224 Everything is ready. Begin running and processing data.
2024-08-09T19:51:26.896Z info localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
$ while true; do sleep 3; echo 'test' > logs; done
2024-08-09T19:51:38.960Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:51:38.960Z info test receiver=a
{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:51:41.958Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:51:41.958Z info test receiver=a
{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:51:44.957Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:51:44.957Z info test receiver=a
{"kind": "exporter", "data_type": "logs", "name": "debug"}
$ ps -ef | grep otelcol_
jblomart 423055 292804 0 19:51 pts/8 00:00:00 ../otelcol --config otelcol_producer.yaml
jblomart 423100 292804 0 19:51 pts/8 00:00:00 ../otelcol --config otelcol_receiver_a.yaml
jblomart 423130 292804 0 19:51 pts/8 00:00:00 ../otelcol --config otelcol_receiver_b.yaml
jblomart 423327 292804 0 19:51 pts/8 00:00:00 grep --color=auto otelcol_
$ sudo kill -9 423100
$
[2]- Killed ../otelcol-csw --config otelcol_receiver_a.yaml
$ while true; do sleep 3; echo 'test' > logs; done
2024-08-09T19:52:15.556Z warn zapgrpc/zapgrpc.go:193 [core] [Channel #2 SubChannel #3]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:4317", ServerName: "localhost:4317", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused" {"grpc_log": true}
2024-08-09T19:52:15.556Z error exporterhelper/common.go:296 Exporting failed. Rejecting data. Try enabling retry_on_failure config option to retry on retryable errors. Try enabling sending_queue to survive temporary failures. {"kind": "exporter", "data_type": "logs", "name": "otlp/a", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused\"", "rejected_items": 1}
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send
go.opentelemetry.io/collector/[email protected]/exporterhelper/common.go:296
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsRequestExporter.func1
go.opentelemetry.io/collector/[email protected]/exporterhelper/logs.go:134
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
github.com/open-telemetry/opentelemetry-collector-contrib/connector/failoverconnector.(*logsFailover).ConsumeLogs
github.com/open-telemetry/opentelemetry-collector-contrib/connector/[email protected]/logs.go:36
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal/consumerretry.(*logsConsumer).ConsumeLogs
github.com/open-telemetry/opentelemetry-collector-contrib/internal/[email protected]/consumerretry/logs.go:37
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter.(*receiver).consumerLoop
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/adapter/receiver.go:126
2024-08-09T19:52:15.556Z error exporterhelper/common.go:296 Exporting failed. Rejecting data. Try enabling retry_on_failure config option to retry on retryable errors. Try enabling sending_queue to survive temporary failures. {"kind": "exporter", "data_type": "logs", "name": "otlp/a", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused\"", "rejected_items": 1}
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send
go.opentelemetry.io/collector/[email protected]/exporterhelper/common.go:296
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsRequestExporter.func1
go.opentelemetry.io/collector/[email protected]/exporterhelper/logs.go:134
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
github.com/open-telemetry/opentelemetry-collector-contrib/connector/failoverconnector.(*logsFailover).FailoverLogs
github.com/open-telemetry/opentelemetry-collector-contrib/connector/[email protected]/logs.go:47
github.com/open-telemetry/opentelemetry-collector-contrib/connector/failoverconnector.(*logsFailover).ConsumeLogs
github.com/open-telemetry/opentelemetry-collector-contrib/connector/[email protected]/logs.go:41
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal/consumerretry.(*logsConsumer).ConsumeLogs
github.com/open-telemetry/opentelemetry-collector-contrib/internal/[email protected]/consumerretry/logs.go:37
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter.(*receiver).consumerLoop
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/adapter/receiver.go:126
2024-08-09T19:52:15.556Z error exporterhelper/common.go:296 Exporting failed. Rejecting data. Try enabling retry_on_failure config option to retry on retryable errors. Try enabling sending_queue to survive temporary failures. {"kind": "exporter", "data_type": "logs", "name": "otlp/a", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused\"", "rejected_items": 1}
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send
go.opentelemetry.io/collector/[email protected]/exporterhelper/common.go:296
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsRequestExporter.func1
go.opentelemetry.io/collector/[email protected]/exporterhelper/logs.go:134
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
github.com/open-telemetry/opentelemetry-collector-contrib/connector/failoverconnector.(*logsFailover).FailoverLogs
github.com/open-telemetry/opentelemetry-collector-contrib/connector/[email protected]/logs.go:47
github.com/open-telemetry/opentelemetry-collector-contrib/connector/failoverconnector.(*logsFailover).ConsumeLogs
github.com/open-telemetry/opentelemetry-collector-contrib/connector/[email protected]/logs.go:41
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
go.opentelemetry.io/collector/[email protected]/logs.go:26
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal/consumerretry.(*logsConsumer).ConsumeLogs
github.com/open-telemetry/opentelemetry-collector-contrib/internal/[email protected]/consumerretry/logs.go:37
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter.(*receiver).consumerLoop
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/adapter/receiver.go:126
2024-08-09T19:52:15.558Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:52:15.558Z info test receiver=b
{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:52:16.557Z warn zapgrpc/zapgrpc.go:193 [core] [Channel #2 SubChannel #3]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:4317", ServerName: "localhost:4317", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused" {"grpc_log": true}
2024-08-09T19:52:18.436Z warn zapgrpc/zapgrpc.go:193 [core] [Channel #2 SubChannel #3]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:4317", ServerName: "localhost:4317", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused" {"grpc_log": true}
2024-08-09T19:52:18.557Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:52:18.557Z info test receiver=b
{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:52:20.642Z warn zapgrpc/zapgrpc.go:193 [core] [Channel #2 SubChannel #3]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:4317", ServerName: "localhost:4317", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused" {"grpc_log": true}
2024-08-09T19:52:21.556Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:52:21.556Z info test receiver=b
{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:52:24.557Z info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-08-09T19:52:24.557Z info test receiver=b
{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-08-09T19:52:25.358Z warn zapgrpc/zapgrpc.go:193 [core] [Channel #2 SubChannel #3]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:4317", ServerName: "localhost:4317", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused" {"grpc_log": true}
Component(s)
connector/failover
What happened?
Description
When sending log signals to two different otlp grpc exporters through the failover connector, the connector does not fail over when one of the otlp grpc receivers is shut down.
Steps to Reproduce
Expected Result
When the first OpenTelemetry Collector, listening for otlp grpc on port 4317, gets killed, the failover mechanism should kick in and send the log signals to the second OpenTelemetry Collector, listening for otlp grpc on port 4318.
Actual Result
The failover does not happen.
Collector version
0.105.0
Environment information
Environment
Ubuntu 20.04
Golang 1.21.6
OpenTelemetry Collector configuration
Log output
Additional context
I tested otlp grpc, otlp http and syslog.
The failover connector does work with the syslog exporter and syslog receiver.
However, this limits the functionality to log signals, and the log signals are re-encoded as RFC 5424 (by default):
sending log signals from one OpenTelemetry Collector to another through the failover connector would then require transformations to finally end up with the exact same log signal on the receiver side.