Unclear "Use of Closed Network Connection" Error #405

Open
abraithwaite opened this issue May 31, 2024 · 2 comments
Labels
bug Something isn't working


abraithwaite commented May 31, 2024

Describe the bug

When using clickhouse-go with clickhouse-cloud, we see the error below during what we suspect are auto-scaling events on the cloud side.

{"time":"2024-05-31T16:22:56.043609257Z","level":"INFO","source":{"function":"github.com/runreveal/lib/await.(*runner).Run.func1","file":"/home/runner/go/pkg/mod/github.com/runreveal/lib/[email protected]/await.go","line":110},"msg":"subroutine error: read:\n    github.com/ClickHouse/ch-go/proto.(*Reader).ReadFull\n        /home/runner/go/pkg/mod/github.com/!click!house/[email protected]/proto/reader.go:62\n  - read tcp 172.30.76.61:54124->35.82.252.60:9440: use of closed network connection"}

I believe this happens because the ClickHouse server proactively closes network connections that have hit some kind of exception. It would be nice to have either a clearer error message or, even better, automatic reconnection and retries when this failure occurs.

Having this error bubble up to the application isn't very useful. All we can do is kill the client and recreate it before re-issuing the query, because we can no longer trust the connection.

To be clear: we are not closing the connection on our end. If the ClickHouse server has to close the connection proactively for whatever reason, I think this library should be able to catch and handle that situation for a cleaner shutdown. At the very least it should return a sentinel error value so we can make a more informed decision about how to handle it.

Steps to reproduce

  1. Create a clickhouse client connection to a clickhouse-cloud instance in Go.
  2. Trigger some scaling events while attempting to insert large batches.
  3. Observe the error

Expected behaviour

A clearer error is returned, or there are options to configure retries, or both.

Error log

{"time":"2024-05-31T16:22:56.043609257Z","level":"INFO","source":{"function":"github.com/runreveal/lib/await.(*runner).Run.func1","file":"/home/runner/go/pkg/mod/github.com/runreveal/lib/[email protected]/await.go","line":110},"msg":"subroutine error: read:\n    github.com/ClickHouse/ch-go/proto.(*Reader).ReadFull\n        /home/runner/go/pkg/mod/github.com/!click!house/[email protected]/proto/reader.go:62\n  - read tcp 172.30.76.61:54124->35.82.252.60:9440: use of closed network connection"}

Configuration

Environment

  • Client version: github.com/ClickHouse/clickhouse-go/v2 v2.23.2
  • ch-go version: v0.61.5
  • Language version: 1.22.3
  • OS: Ubuntu Linux 22.04

ClickHouse server

  • ClickHouse Server version: 24.2.2
  • ClickHouse Server non-default settings, if any: clickhouse-cloud default settings.
  • CREATE TABLE statements for tables involved:
  • Sample data for all these tables, use clickhouse-obfuscator if necessary
@abraithwaite abraithwaite added the bug Something isn't working label May 31, 2024
@abraithwaite abraithwaite changed the title Use of Closed Network Connection Unclear Use of Closed Network Connection Error May 31, 2024
@abraithwaite abraithwaite changed the title Unclear Use of Closed Network Connection Error Unclear "Use of Closed Network Connection" Error May 31, 2024
charredlot pushed a commit to charredlot/ch-go that referenced this issue Jun 24, 2024
…#405)

When the server closes the connection unexpectedly, the client
will call cancelQuery (e.g. when client.packet fails). In
client.cancelQuery, if client.flushBuf has data to flush, it will
return a non-nil error and return early without calling conn.Close.
This prevents the chpool from removing the client after client.Do
and leaves the client.conn in a bad state such that future writes will
always fail with a "broken pipe" error.
@charredlot
Contributor

Not sure if it's the same issue, but I ran into a similar problem with the chpool client getting stuck in a bad state when the server side disconnects. I made a PR here: #409

charredlot added a commit to charredlot/ch-go that referenced this issue Jun 26, 2024
…#405)

When the server closes the connection unexpectedly, the client
will call cancelQuery (e.g. when client.packet fails). In
client.cancelQuery, if client.flushBuf has data to flush, it will
return a non-nil error and return early without calling conn.Close.
This prevents the chpool from removing the client after client.Do
and leaves the client.conn in a bad state such that future writes will
always fail with a "broken pipe" error.
ernado added a commit that referenced this issue Jul 1, 2024
fix(query): always close client.conn in cancelQuery (issue #405)
@cwegener

FWIW, I can see these errors when using a local ClickHouse server standalone instance.
