Skip to content
This repository has been archived by the owner on Jun 21, 2024. It is now read-only.

hook-worker: deny traffic to internal IPs and IPv6 #35

Merged
merged 8 commits into from
May 6, 2024
Merged

Conversation

xvello
Copy link
Contributor

@xvello xvello commented May 3, 2024

Make sure that requests to our internal systems cannot be executed through the hook delivery system. In order to be resilient to DNS rebinding attacks, we need to insert our validation logic between the DNS resolution and the HTTP request execution. In this PR, I am wrapping reqwest's Resolving interface to intercept and validate the IPs returned by the system's DNS resolver.

  • Add PublicIPv4Resolver, that copies hyper's GaiResolver logic + filtering of the results via is_global_ipv4
    • It removes non-public IPs from the resolution results, to make sure that we don't request internal hosts. Since our infra is currently IPv4-only and unable to contact IPv6 destinations, we just filter out all IPv6 addresses for now. We'll implement non-public IPv6 filtering when needed
    • It returns a NoPublicIPv4Error if the filtered lists is empty, making sure we don't get hard to diagnose "network unreachable" errors. Make sure that this error is surfaced in logs
  • Add tests for the resolver + the worker itself

@xvello xvello force-pushed the xvello/privateip branch from 6ab6c2a to 250082c Compare May 3, 2024 13:09
Comment on lines +12 to +15
/// Returns [`true`] if the address appears to be a globally reachable IPv4.
///
/// Trimmed down version of the unstable IpAddr::is_global, move to it when it's stable.
fn is_global_ipv4(addr: &SocketAddr) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As clear as it gets, very nice 👍


// Execute the blocking call in a separate worker thread then process its result asynchronously.
// spawn_blocking returns a JoinHandle that implements Future<Result<(closure result), JoinError>>.
let future_result = spawn_blocking(resolve_host).map(|result| match result {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any risk of running out of blocking threads? Should we cache these resolutions in some way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • spawn_blocking's doc indicates that a thread poll is used, and tasks are queued if no thread is available. The default pool max size of 512 is way higher than our concurrency. That's also what the default resolver does.

  • The glibc already handles caching, so caching again here would be a premature optimization. Also, it makes debugging harder: I spend weeks working around grpc-java's internal DNS caching creating errors when services rolled out.

Copy link
Contributor

@bretthoerner bretthoerner May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this skeleton with all but the (sync) IP range blocking logic decided, where I stole from the reqwest GaiResolver and didn't use spawn_blocking: 825a9bc

🤷 If it's helpful, maybe I'm missing something there, I haven't woken up yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HyperGaiResolver::call boils down to tokio::task::spawn_blocking(move || { (&*name.host, 0).to_socket_addrs().map(|i| SocketAddrs { iter: i }), which is what I inlined here instead of adding another dependency.

I'm open to erroring out instead of filtering out, which could help with debugging. That would probably require us to implement filtering out private IPV6 too while it's not needed.

Comment on lines +41 to +42
#[envconfig(default = "false")]
pub allow_internal_ips: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this ever be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

end2end tests against containerized endpoint, local testing, hobby deploy

@xvello xvello force-pushed the xvello/privateip branch 2 times, most recently from 61a25ce to 34cca52 Compare May 6, 2024 11:17
@xvello xvello force-pushed the xvello/privateip branch from 34cca52 to e040018 Compare May 6, 2024 11:22
@xvello xvello marked this pull request as ready for review May 6, 2024 11:40
@xvello xvello changed the title hook-worker: deny traffic to internal IPs hook-worker: deny traffic to internal IPs and IPv6 May 6, 2024
@xvello xvello requested a review from a team May 6, 2024 12:26
@xvello xvello requested a review from bretthoerner May 6, 2024 13:30
@xvello xvello force-pushed the xvello/privateip branch from ac81afe to f2df29f Compare May 6, 2024 14:11
Copy link
Contributor

@bretthoerner bretthoerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I do feel like this may totally forgotten when it gets deployed on an IPv6 enabled network sometime (possibly far) in the future, I guess we're just banking on alerts to save us then?

@xvello
Copy link
Contributor Author

xvello commented May 6, 2024

when it gets deployed on an IPv6 enabled network sometime (possibly far) in the future, I guess we're just banking on alerts to save us then?

Because we preemptively filter out ipv6 addresses, deploying this service on an IPV6-enabled cluster will be a no-op instead of opening a potential security issue. We'll then schedule the work to safely check IPv6 records and enable support for IPv6-only destinations.

I think it would be fine to do that work right now if we want, I just wanted to reduce scope to be able to ship this before going out-of-office

@bretthoerner
Copy link
Contributor

Because we preemptively filter out ipv6 addresses, deploying this service on an IPV6-enabled cluster will be a no-op instead of opening a potential security issue. We'll then schedule the work to safely check IPv6 records and enable support for IPv6-only destinations.

Thanks, that makes sense. I have some very old PTSD around musl truncating the DNS response when there were a lot of addresses too return. We are of course not (currently) using musl, but I just had some unease about what exactly would be returned by getaddressinfo on an IPv6 enabled network. It looks like the default is both address types. (And glibc isn't so bad about the unrelated truncation either.)

@xvello xvello merged commit 281af61 into main May 6, 2024
4 checks passed
@xvello xvello deleted the xvello/privateip branch May 6, 2024 14:31
@xvello
Copy link
Contributor Author

xvello commented May 6, 2024

The glibc will return all the IPv4 & IPv6 extracted from the A, AAAA and CNAME records, mixed in a vector, so dual-stack endpoints should give us IPv4 transparently.

I think that you are mentioning the fact that musl can only read the first 512 bytes of DNS response, meaning that results can be truncated. In this case, it would be possible for us to fail if the response starts with IPv6 addresses indeed. I'd recommend using the pure-rust hickory_resolver crate if we want to move off the debian base image.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants