
Optionally allow specification of custom DNS resolver #584

Closed
oxtoacart wants to merge 4 commits

Conversation

Contributor

@oxtoacart oxtoacart commented Oct 5, 2023

I tested this on route d1192fd7-4b4f-48dc-87d7-d99850445f53. I defaulted dns-servers to 172.16.0.53:53,8.8.8.8:53 and put in some debug statements to make sure our custom dialer is being used (it is). I was able to successfully proxy through the host.

This will allow us to specify explicit nameservers for http-proxy to use, bypassing Docker for DNS. This is useful because Docker has been hanging lately, causing name resolution on the proxies to fail.
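For context, the general shape of the approach is a resolver whose Dial callback ignores whatever Docker injected via /etc/resolv.conf and talks to an explicit server instead. The following is a minimal, standalone sketch using only the standard library (the PR itself dials through getlantern's netx and reads the servers from the dns-servers option); resolverFor, the hard-coded server, and the 5-second timeout are illustrative, not the PR's code:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolverFor returns a net.Resolver that sends every DNS query to dnsServer
// (e.g. "8.8.8.8:53") instead of the resolver configured in /etc/resolv.conf.
func resolverFor(dnsServer string) *net.Resolver {
	return &net.Resolver{
		// PreferGo forces the pure-Go resolver so the Dial override below is used.
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			d := &net.Dialer{Timeout: 5 * time.Second}
			// Ignore the address the resolver picked and dial our explicit
			// server, keeping the network (udp or tcp) it asked for.
			return d.DialContext(ctx, network, dnsServer)
		},
	}
}

func main() {
	ips, err := resolverFor("8.8.8.8:53").LookupIP(context.Background(), "ip", "example.com")
	fmt.Println(ips, err)
}
```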

@oxtoacart oxtoacart force-pushed the custom-dns branch 6 times, most recently from cd7537c to 8a49af2, October 6, 2023 00:29
@oxtoacart oxtoacart marked this pull request as ready for review October 6, 2023 02:03
@oxtoacart oxtoacart requested review from hwh33 and Crosse October 6, 2023 02:03
@oxtoacart oxtoacart force-pushed the custom-dns branch 7 times, most recently from 583db73 to f0698d4, October 6, 2023 10:59
@oxtoacart
Contributor Author

I've realized that this isn't working with Slack, so I have some work to do I guess.

@oxtoacart oxtoacart marked this pull request as draft October 6, 2023 12:52
custom_dns.go Outdated
// Google anomaly detection can be triggered very often over IPv6.
// Prefer IPv4 to mitigate, see issue #97
// If no IPv4 is available, fall back to IPv6
for _, candidate := range ips {
Contributor Author

This replaces the logic from preferIPv4.
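The IPv4-first selection that this loop implements looks roughly like the following sketch, reconstructed from the quoted hunk; the helper name pickIP, the package name, and the nil return for an empty slice are assumptions:

```go
package customdns

import "net"

// pickIP prefers an IPv4 address because Google anomaly detection triggers
// much more often over IPv6 (see http-proxy-lantern issue #97); it falls back
// to the first resolved address only when no IPv4 address is available.
func pickIP(ips []net.IP) net.IP {
	for _, candidate := range ips {
		if candidate.To4() != nil {
			return candidate
		}
	}
	if len(ips) > 0 {
		return ips[0]
	}
	return nil
}
```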

@oxtoacart
Contributor Author

Okay, this is now ready for review. I verified that it works both with the system resolver and with specifying 172.16.0.53:53,8.8.8.8:53 as the DNS servers. And Slack is happy.

@oxtoacart oxtoacart marked this pull request as ready for review October 6, 2023 14:23
Contributor

@Crosse Crosse left a comment

This is pretty neat! I'll defer to @hwh33 since he's more familiar with the codebase, but this seems good to me.

custom_dns.go Outdated
r := &net.Resolver{
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
return netx.DialContext(ctx, "udp", dnsServer)
Contributor

For responses larger than a single UDP payload can carry, the server may tell the client to retry over TCP. While this isn't terribly common, it can happen quite a lot with larger TXT and DNSSEC records. Since (eventually) the Go resolver is only looking for A and AAAA records, this is probably fine, but it's worth noting.

Contributor

I don't understand why we'd ever want to overwrite the network parameter... Shouldn't we trust that the Go DNS resolver is specifying the correct network?

Contributor Author

Cool. It looks like both unbound and Google support TCP connections on port 53, so I'll just pass through the protocol.
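In other words, the Dial callback ends up forwarding the resolver's requested network instead of hard-coding "udp". A sketch of that change, reusing the identifiers from the hunk above (netx, dnsServer):

```go
r := &net.Resolver{
	PreferGo: true,
	Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
		// Pass through whatever network the Go resolver asked for (udp or
		// tcp), so truncated responses can be retried over TCP; only the
		// server address is overridden.
		return netx.DialContext(ctx, network, dnsServer)
	},
}
```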

Contributor Author

Fixed.

@Crosse
Contributor

Crosse commented Oct 6, 2023

I have another take on this in getlantern/lantern-cloud#423 as well. I haven't had time to test it yet, but manually running docker run ... --mount ... worked, so I'm hopeful.

@oxtoacart
Contributor Author

Closing in favor of https://github.com/getlantern/lantern-cloud/pull/423

@oxtoacart
Contributor Author

@Crosse I've changed this so that it resolves in parallel now. I tested the parallel resolution on route 4f549ac2-6371-4de9-84e2-1a4ec65cf1e4 and I'm able to successfully proxy through it and don't see DNS resolution errors in the log.

custom_dns.go Outdated

// Returns a dialer that uses custom DNS servers to resolve the host. It uses all DNS servers
// in parallel and uses the first response it gets.
func customDNSDialer(dnsServers []string, timeout time.Duration) (func(context.Context, string, string) (net.Conn, error), error) {
Contributor

I think the parameter should be named dialTimeout. As it stands, it would intuitively be the resolution timeout (to me anyway).

Contributor Author

It is the resolution timeout. I just happen to also set it as a dial timeout because there's no point waiting to dial past the resolution timeout.

Contributor Author

Oh I see, it is a dialTimeout, even I was confused!

Contributor Author

Fixed.
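For reference, one way to share a single budget between resolution and the subsequent dial is to derive one context up front and let both steps consume it. This is a sketch with illustrative names (resolveAndDial, customdns), not the PR's exact code:

```go
package customdns

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolveAndDial bounds DNS resolution and the subsequent TCP dial with one
// shared timeout: whatever time resolution uses is no longer available to the
// dial, so the caller sees a single worst-case latency.
func resolveAndDial(parent context.Context, r *net.Resolver, host, port string, timeout time.Duration) (net.Conn, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	ips, err := r.LookupIP(ctx, "ip", host)
	if err != nil {
		return nil, err
	}
	if len(ips) == 0 {
		return nil, fmt.Errorf("no addresses for %s", host)
	}

	// DialContext honors ctx's deadline, so no separate dial timeout is needed.
	var d net.Dialer
	return d.DialContext(ctx, "tcp", net.JoinHostPort(ips[0].String(), port))
}
```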

custom_dns.go Outdated
r := &net.Resolver{
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
return netx.DialContext(ctx, "udp", dnsServer)
Contributor

I don't understand why we'd ever want to overwrite the network parameter... Shouldn't we trust that the Go DNS resolver is specifying the correct network?

custom_dns.go Outdated
Comment on lines 59 to 66
var resolveErr error
select {
case resolveErr = <-errs:
// got an error
default:
// no error, we just timed out
}
return nil, errors.New("unable to resolve host %v, last resolution error: %v", host, resolveErr)
Contributor

This will be mildly confusing in the case of a timeout, where resolveErr will be nil. You could just return different errors in each of the select cases.
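The suggested shape would be something like the following sketch, reusing the hunk's identifiers (errs, host, and the project's errors package, which takes printf-style arguments as in the quoted code); the exact messages are illustrative:

```go
select {
case resolveErr := <-errs:
	// A resolver reported a concrete failure; surface it directly.
	return nil, errors.New("unable to resolve host %v, last resolution error: %v", host, resolveErr)
default:
	// Nothing on errs means no resolver errored; we simply ran out of time.
	return nil, errors.New("timed out resolving host %v", host)
}
```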

Contributor Author

Fixed.

custom_dns.go Outdated

resolvedAddr := fmt.Sprintf("%s:%s", ip, port)
d := &net.Dialer{
Deadline: time.Now().Add(timeout),
Contributor

Nit: it'd be a little more direct to just set Timeout: timeout (it's kind of weird that net.Dialer supports deadlines and timeouts simultaneously).
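That is, the same intent expressed as a relative timeout rather than an absolute deadline:

```go
d := &net.Dialer{
	// Equivalent to Deadline: time.Now().Add(timeout), but expressed as a
	// relative duration via net.Dialer's Timeout field.
	Timeout: timeout,
}
```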

Contributor Author

Oh cool, they must have added that somewhere along the way. I have a memory of only Deadline being available.

Contributor Author

Fixed.

errors <- err
return
}
if len(ips) > 0 {
Contributor

This will mask misses as timeouts. We could instead have a miss channel or something.

Contributor Author

It turns out that if the DNS resolver can't find the host, it will return a "no such host" error, so I don't think we have to worry about masking. I did adjust the logic so that if all resolvers error, we fail immediately instead of waiting to hit the timeout. I added a unit test for that.
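A sketch of that overall shape — resolvers queried in parallel, first answer wins, and an error counter that fails fast once every resolver has reported — using illustrative names (lookupParallel, customdns) rather than the PR's exact code:

```go
package customdns

import (
	"context"
	"fmt"
	"net"
)

// lookupParallel queries every resolver at once and returns the first
// successful answer. If every resolver errors (including "no such host"),
// it fails immediately with the last error instead of waiting for the
// context to expire.
func lookupParallel(ctx context.Context, resolvers []*net.Resolver, host string) ([]net.IP, error) {
	// Buffered so that late goroutines never block after we return.
	results := make(chan []net.IP, len(resolvers))
	errs := make(chan error, len(resolvers))
	for _, r := range resolvers {
		go func(r *net.Resolver) {
			ips, err := r.LookupIP(ctx, "ip", host)
			if err != nil {
				errs <- err
				return
			}
			results <- ips
		}(r)
	}

	errorCount := 0
	for {
		select {
		case ips := <-results:
			return ips, nil
		case err := <-errs:
			errorCount++
			if errorCount == len(resolvers) {
				// Every resolver failed; no point waiting for the timeout.
				return nil, fmt.Errorf("all %d resolvers failed, last error: %w", len(resolvers), err)
			}
		case <-ctx.Done():
			return nil, fmt.Errorf("timed out resolving %s: %w", host, ctx.Err())
		}
	}
}
```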

Contributor

Thanks for digging into it!

}
if len(ips) > 0 {
// Google anomaly detection can be triggered very often over IPv6.
// Prefer IPv4 to mitigate, see issue #97
Contributor

What is this referencing? https://github.com/getlantern/engineering/issues/97 doesn't look related.

Contributor

@hwh33 hwh33 Oct 10, 2023

Oh I see, this comment was moved from elsewhere. @oxtoacart do you know what the deal is with this? I don't quite understand it. Is the comment saying that Google services don't like it when our proxy connects over IPv6?

Contributor Author

It's referencing https://github.com/getlantern/http-proxy-lantern/issues/97

IIRC, we were getting tons of CAPTCHAs and changing this to force IPv4 reduced that quite a bit.

@oxtoacart
Contributor Author

@hwh33 Thanks for the great feedback! I've addressed your comments and retested on route d1192fd7-4b4f-48dc-87d7-d99850445f53.

Comment on lines +71 to +72
errorCount++
if errorCount == len(resolvers) {
Contributor

Nice!

start := time.Now()
_, err = d(context.Background(), "tcp", "blubbaasdfsadfsadf.dude:443")
require.Error(t, err)
fmt.Println(err)
Contributor

Nit: leftover print statement

Contributor

@hwh33 hwh33 left a comment

One nit-pick, but looks good to me! Thanks for addressing that stuff =)

Contributor

@Crosse Crosse left a comment

I'm generally good with this, code-wise. I would suggest not having a default value for the DNS servers to use and making it an explicit opt-in (principle of least astonishment).

After thinking about it longer, my only reservation is parallelizing the DNS queries. In our current situation, we would send simultaneous queries to our local resolver and Google and let them race for an answer, which sounds good on the face of it. But note that each of our phosts averages around 3,000-5,000 queries per second, for an aggregate of around 100,000 queries per second. The reason Colin and I began using a local resolver in the first place was that we suspected our DNS queries were being rate-limited by the public resolver we had configured, and that was at much lower levels than we see today. If we unconditionally send queries to some non-Lantern-controlled DNS servers, we risk the same result and, potentially, a full block.

In my opinion, the best solution for all of this is actually to fix our local resolver issues (i.e., unbound segfaulting and locking up; see getlantern/lantern-cloud#428 for an idea of replacing Unbound altogether with BIND) and, eventually, run our own caching resolvers in the lantern cloud to act as upstreams for all of the phost-local forwarding resolvers.

@oxtoacart
Contributor Author

In my opinion, the best solution for all of this is actually to fix our local resolver issues (i.e., unbound segfaulting and locking up; see https://github.com/getlantern/lantern-cloud/pull/428 for an idea of replacing Unbound altogether with BIND) and, eventually, run our own caching resolvers in the lantern cloud to act as upstreams for all of the phost-local forwarding resolvers.

I agree 100%.

@myleshorton
Collaborator

Should we close this one then? I generally agree that hammering Google DNS is probably not a great idea.

@Crosse
Contributor

Crosse commented Oct 20, 2023

Should we close this one then?

I'm on the fence. I think having the option to let lantern-proxy do custom DNS resolution could be useful in the future even if we don't use it right now, but I also wonder if we should avoid adding a knob for functionality we don't intend to use in the near-term.

@oxtoacart
Contributor Author

I also wonder if we should avoid adding a knob for functionality we don't intend to use in the near-term.

Yeah, I would just close this PR and if you ever need that knob in the future, you can resurrect it.

@Crosse Crosse closed this Oct 20, 2023