Reconnections fail (perhaps only via tor) #205

Open
tsjk opened this issue Mar 13, 2023 · 14 comments

@tsjk

tsjk commented Mar 13, 2023

I've recently started to use this. I set up a teosd service with tor support (using communication with tor's control port). The cln client (v22.11) always uses a proxy.
The problem is that it works for a while and then just stops. I get repeated log lines like

UNUSUAL plugin-watchtower-client: <tower_id> is unreachable. Adding <id> to pending

This goes on for hours.
Abandoning the tower on the client side and re-registering it makes it work for a while again - without touching the server.
The client and server are on different systems, different Internet links, and use different tor daemons (for clarity). Both sides use tor v0.4.7.13.

@sr-gi
Member

sr-gi commented Mar 13, 2023

It may be helpful if you could provide some Tor logs too, I'm pretty clueless otherwise.

Also, does this happen only with your tower? Have you tried others running over Tor? (e.g. #158 (comment))

@tsjk
Author

tsjk commented Mar 13, 2023

I currently only use my own tower.
Regarding tor logs, I assume you mean on the client. But, I have logging set to notice, and neither the server nor the client says anything. I could try to increase the debug verbosity?

@sr-gi
Member

sr-gi commented Mar 13, 2023

> I currently only use my own tower.
> Regarding tor logs, I assume you mean on the client. But, I have logging set to notice, and neither the server nor the client says anything. I could try to increase the debug verbosity?

I actually meant logs from the Tor daemon (on the client side indeed). You may also increase deps verbosity to see if something is being logged in the plugin logs.

@tsjk
Author

tsjk commented Mar 13, 2023

deps verbosity?

@sr-gi
Member

sr-gi commented Mar 13, 2023

> deps verbosity?

Oh sorry, nvm, we only have that distinction on the tower side, not on the client side (the tower side only logs lines from dependencies if they are above warning level).

Increasing the debug verbosity might help indeed.

@tsjk
Author

tsjk commented Mar 13, 2023

Eh. Well. We'll see.
I was actually a bit sloppy and just used the previously running tor relay on the system for CLN. But, when wanting to send debug info I kind of didn't like the idea of sending debug data from a live relay, and so I migrated my CLN away from the tor relay to a tor client. Since then I haven't observed the problem - of course! (>_<)
I still get disconnects, but now it says

INFO    plugin-watchtower-client: Retrying tower <tower_id>
...
...
INFO    plugin-watchtower-client: Retry strategy succeeded for <tower_id>

after a few seconds... Previously I didn't see it retrying at all.
I'll hold off a bit and update when (if?) the problem re-appears.
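
To tell the two behaviours apart in a longer log, the three messages quoted in this thread can be matched mechanically. The following is only a rough Rust sketch, assuming the message texts are exactly as pasted above; `TowerEvent`, `parse_event` and `towers_without_retry` are made-up names and not part of the plugin, and the tower ids in `main` are placeholders.

```rust
use std::collections::BTreeSet;

/// The three client log messages quoted in this thread.
#[derive(Debug, PartialEq)]
enum TowerEvent {
    Unreachable(String),
    Retrying(String),
    RetrySucceeded(String),
}

/// Extracts a tower event from one log line, or None for unrelated lines.
fn parse_event(line: &str) -> Option<TowerEvent> {
    const TAG: &str = "plugin-watchtower-client: ";
    // Skip whatever precedes the plugin tag (log level, timestamps, ...).
    let msg = &line[line.find(TAG)? + TAG.len()..];
    if let Some(id) = msg.strip_prefix("Retrying tower ") {
        Some(TowerEvent::Retrying(id.trim().to_string()))
    } else if let Some(id) = msg.strip_prefix("Retry strategy succeeded for ") {
        Some(TowerEvent::RetrySucceeded(id.trim().to_string()))
    } else if let Some((id, _)) = msg.split_once(" is unreachable") {
        Some(TowerEvent::Unreachable(id.trim().to_string()))
    } else {
        None
    }
}

/// Towers that were reported unreachable and never retried afterwards.
fn towers_without_retry<'a>(lines: impl IntoIterator<Item = &'a str>) -> Vec<String> {
    let mut stuck = BTreeSet::new();
    for event in lines.into_iter().filter_map(parse_event) {
        match event {
            TowerEvent::Unreachable(id) => {
                stuck.insert(id);
            }
            TowerEvent::Retrying(id) | TowerEvent::RetrySucceeded(id) => {
                stuck.remove(&id);
            }
        }
    }
    stuck.into_iter().collect()
}

fn main() {
    // Placeholder tower ids; real logs carry the actual tower id.
    let log = [
        "UNUSUAL plugin-watchtower-client: tower_a is unreachable. Adding id1 to pending",
        "INFO    plugin-watchtower-client: Retrying tower tower_b",
        "INFO    plugin-watchtower-client: Retry strategy succeeded for tower_b",
    ];
    println!("{:?}", towers_without_retry(log));
}
```

A tower that shows up in the result was marked unreachable without any retry message following it, which is the pattern from the original report.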

@sr-gi
Member

sr-gi commented Mar 13, 2023

> INFO    plugin-watchtower-client: Retrying tower <tower_id>
> ...
> ...
> INFO    plugin-watchtower-client: Retry strategy succeeded for <tower_id>
>
> after a few seconds... Previously I didn't see it retrying at all. I'll hold off a bit and update when (if?) the problem re-appears.

This is more the expected behavior. If a post request times out or cannot reach its destination, a retrier is created and data is passed to it. The retrier implements an exponential backoff strategy until the data is finally delivered, or it ends up giving up.
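
For illustration, the backoff loop described above could look roughly like this. It is a generic sketch and not the plugin's actual retrier: `backoff_delay`, `retry_with_backoff`, the doubling factor, the cap and the attempt limit are all assumptions chosen for the example. The sleep function is passed in so the schedule can be inspected without waiting.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `send` until it succeeds or `max_attempts` calls have been made.
/// Between failed calls it waits via `sleep` with exponentially growing delays.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut send: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match send() {
            Ok(value) => return Ok(value),
            // Out of attempts: give up and hand back the last error.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(base, max, attempt));
                attempt += 1;
            }
        }
    }
}

fn main() {
    let mut calls = 0;
    let mut waits = Vec::new();
    // Simulated tower that is unreachable for the first three requests.
    let result: Result<&str, &str> = retry_with_backoff(
        5,
        Duration::from_secs(1),
        Duration::from_secs(60),
        || {
            calls += 1;
            if calls <= 3 { Err("unreachable") } else { Ok("delivered") }
        },
        |delay| waits.push(delay), // record instead of sleeping
    );
    println!("{result:?} after {calls} calls, waits: {waits:?}");
}
```

In real use the `sleep` argument would be something like `std::thread::sleep`, or an async timer in an async client.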

@tsjk
Author

tsjk commented Mar 15, 2023

I still think something is amiss here.
I've only quickly glanced through some logs (set to info - debug logs are insanely hard to follow), and my intuition got me wondering whether it'd be useful to request a new circuit at reconnection attempts. I was thinking that perhaps an existing tower has some tor connection info associated with it. This would explain why abandonment followed by re-registration works.
During a disconnect I noticed the Tor log line "intro_point_is_usable(): Intro point with auth key [scrubbed] had an error. Not usable", which made me wonder whether the retry mechanism retries the wrong thing here. Will try to provide useful logs as time allows.

@sr-gi
Member

sr-gi commented Mar 15, 2023

> I was thinking that perhaps an existing tower has some tor connection info associated with it. This would explain why abandonment followed by re-registration works.

I don't think I follow

> During a disconnect I noticed the Tor log line "intro_point_is_usable(): Intro point with auth key [scrubbed] had an error. Not usable", which made me wonder whether the retry mechanism retries the wrong thing here. Will try to provide useful logs as time allows.

Code-wise we don't do anything out of the ordinary here. If a proxy is provided, we just proxy the request through it. I'm not an expert using Tor though so I may be missing something. Let me know if there is anything I can help with, I've been running a tower on Tor for months so if something is iffy I may be able to find some useful logs.

@tsjk
Author

tsjk commented Mar 15, 2023

Yeah, ok. What I was thinking is that when the tower is abandoned and re-registered, the reconnection attempt is different from a retry. I have no detailed insights into tor either (and I haven't checked what you do in the code), but afaik re-creating the connection to the tor socks proxy will result in requesting a new circuit, while re-use won't. So, if a broken circuit is what makes the reconnect fail, abandonment will likely discard the connection, and re-registration will create a new connection to the socks proxy, thereby resulting in a new circuit being built.

@mariocynicys
Collaborator

> Yeah, ok. What I was thinking is that when the tower is abandoned and re-registered, the reconnection attempt is different from a retry.

Nope, the only piece of network related info we store for a tower is its onion address. No circuit/connection info. Thus, abandoning a tower shouldn't do anything special.

One question though: can an application using Tor as a proxy only (no access to control port) request a new circuit?

@tsjk
Author

tsjk commented Mar 15, 2023

> > Yeah, ok. What I was thinking is that when the tower is abandoned and re-registered, the reconnection attempt is different from a retry.
>
> Nope, the only piece of network related info we store for a tower is its onion address. No circuit/connection info. Thus, abandoning a tower shouldn't do anything special.
>
> One question though: can an application using Tor as a proxy only (no access to control port) request a new circuit?

I think I already answered your question above. :)

@mariocynicys
Collaborator

> I think I already answered your question above. :)

My bad xD.

> but afaik re-creating the connection to the tor socks proxy will result in requesting a new circuit

This means we actually use a new circuit every post request.

reqwest::Client::builder()
    .proxy(
        reqwest::Proxy::http(proxy.get_socks_addr())
            .map_err(|e| RequestError::ConnectionError(format!("{e}")))?,
    )
    .build()
    .map_err(|e| RequestError::ConnectionError(format!("{e}")))?

@tsjk
Author

tsjk commented Mar 16, 2023

I don't know how it works behind the scenes. Some APIs cache and reuse sockets. I think the magic on the tor side relies on the client address (this does not work for unix domain sockets, but that can be disregarded here). So, to get a new circuit one needs to change the source port of the call.
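
On the question of whether a proxy-only application can ask for a fresh circuit: the Tor manual lists `IsolateSOCKSAuth` among the `SocksPort` isolation flags that are on by default, so streams that present different SOCKS5 credentials are not supposed to share a circuit. A client could therefore vary the username/password per retry instead of relying on the source port. The sketch below only builds the relevant SOCKS5 messages (RFC 1929 framing) and assumes the default isolation flags are in effect; `isolation_credentials` and its naming scheme are hypothetical, and whether this would help with the intro point error seen here is untested.

```rust
/// SOCKS5 greeting offering only username/password authentication (method 0x02).
fn socks5_greeting() -> [u8; 3] {
    [0x05, 0x01, 0x02] // version 5, one method offered, method 0x02
}

/// RFC 1929 username/password request: VER(0x01) ULEN UNAME PLEN PASSWD.
/// Returns None if either field is empty or longer than 255 bytes.
fn socks5_auth_request(username: &str, password: &str) -> Option<Vec<u8>> {
    let (user, pass) = (username.as_bytes(), password.as_bytes());
    if user.is_empty() || pass.is_empty() || user.len() > 255 || pass.len() > 255 {
        return None;
    }
    let mut msg = Vec::with_capacity(3 + user.len() + pass.len());
    msg.push(0x01); // sub-negotiation version
    msg.push(user.len() as u8);
    msg.extend_from_slice(user);
    msg.push(pass.len() as u8);
    msg.extend_from_slice(pass);
    Some(msg)
}

/// Hypothetical helper: distinct credentials per tower and retry attempt, so
/// that (assuming IsolateSOCKSAuth) each attempt lands in its own isolation group.
fn isolation_credentials(tower_id: &str, attempt: u32) -> (String, String) {
    (format!("tower-{tower_id}"), format!("attempt-{attempt}"))
}

fn main() {
    // "example" is a placeholder tower id.
    let (user, pass) = isolation_credentials("example", 2);
    let request = socks5_auth_request(&user, &pass).expect("credentials fit in 255 bytes");
    println!("greeting: {:?}", socks5_greeting());
    println!("auth request: {} bytes for {user}/{pass}", request.len());
}
```

Tor accepts any credentials on its SOCKS port and only uses them for isolation, so the values do not need to be secret.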
