Support for containers running on macvlan networks #96

lox · 2023-05-14T11:04:00Z

I have containers that run on macvlan networks (to allow broadcast/multicast easily). I'd love to be able to use whalewall to firewall off web admin ports, but allow access to things like DNS.

An example might be adguard, which needs 53/tcp and 53/udp accessible, but I'd like to limit access to 80/443 to only a specific network (traefik vs the macvlan network).

Might something like that be in-scope for whalewall?

capnspacehook · 2023-05-14T12:11:29Z

I've never used macvlan networks, but that should be possible now. Try specifying the network on the output rules that you want to limit to a single network, and let me know if that works.

lox · 2023-05-14T21:46:13Z

Interesting, looks like macvlan containers aren't subject to a lot of the standard iptables chains and need something like this: https://github.com/deitch/ctables

esev · 2023-08-08T16:58:48Z

I've been looking into something similar and discovered whalewall. I wonder if it'd be possible to move all the rules into the container's network namespace. The NetworkSettings.SandboxKey, from Docker inspect, seems to point to a file related to the network namespace. I haven't experimented enough yet, but I wonder if opening that file is all that would be needed to use it with nftables.WithNetNSFd.

Bonus: no cleanup would be needed when a container exits, as all the rules should go away along with the namespace when the namespace is destroyed.
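
For illustration, the same idea can be tried from a shell before writing any Go. The sketch below swaps in `nsenter --net=<file>` (util-linux) for the `nftables.WithNetNSFd` option mentioned above, so it only shows that the SandboxKey file can be used to reach the container's network namespace. The helper names `sandbox_key_of` and `in_netns`, the `DRY_RUN` switch, and the container name `adguard` are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: run a command inside a container's network namespace from the host.
# Assumes docker, nsenter (util-linux) and nft are installed on the host and that
# the caller is root. The helper names below are hypothetical.

# Print the NetworkSettings.SandboxKey path reported by `docker inspect`.
sandbox_key_of() {
    docker inspect --format '{{.NetworkSettings.SandboxKey}}' "$1"
}

# Run a command in the network namespace behind a SandboxKey path.
# With DRY_RUN=1 the nsenter command line is printed instead of executed.
in_netns() {
    key=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "nsenter --net=$key $*"
    else
        nsenter --net="$key" "$@"
    fi
}

# Usage (as root on the Docker host; "adguard" is a placeholder container name):
#   in_netns "$(sandbox_key_of adguard)" nft list ruleset
```

If `nft list ruleset` run this way shows only the container's own tables, that suggests rules added through the same namespace file would stay scoped to that container.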

capnspacehook · 2023-08-22T23:22:35Z

This should be possible. The only downside is the nftables rules could be viewed or modified by the root user in the Docker container, since the rules would be in the container's namespace instead of the host's.

esev · 2023-08-23T01:32:50Z

> the only downside is the nftables rules could be viewed or modified by the root user in the Docker container,

I think this would only be possible for containers with the NET_ADMIN capability. But I agree, the more generic solution would be the one implemented today.
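
One way to check this for a given container is to read the effective capability mask (`CapEff`) from `/proc/<pid>/status` and test bit 12, which is `CAP_NET_ADMIN` on Linux. The following is a small sketch of that check; the function names `has_net_admin` and `current_capeff` are made up for the example.

```shell
#!/bin/sh
# Sketch only: test whether a capability mask includes CAP_NET_ADMIN (bit 12).
# Function names are hypothetical.

# Succeed if the hex mask (as printed in the CapEff line of /proc/<pid>/status)
# has bit 12 set.
has_net_admin() {
    mask=$1
    [ $(( 0x$mask & (1 << 12) )) -ne 0 ]
}

# Print the effective capability mask of the current process.
current_capeff() {
    awk '/^CapEff:/ { print $2 }' /proc/self/status
}

# Usage (inside the container):
#   if has_net_admin "$(current_capeff)"; then
#       echo "this process could modify nftables rules in its namespace"
#   fi
```

A mask of `00000000a80425fb`, for example, does not have bit 12 set, so a root process with that mask should not be able to change the rules.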

Jip-Hop · 2024-02-07T11:26:16Z

> Bonus: no cleanup would be needed when a container exits, as all the rules should go away along with the namespace when the namespace is destroyed.

I did a little experiment with firewall rules in the container namespace. It is inspired by how kubernetes/podman pods work.

Upsides:

- No need to mount the Docker socket
- No Docker polling/event listening
- Rules are guaranteed to be set up before the restricted container starts
- Plain nftables rules
- Cleans up after itself

Downsides:

- More YAML
- Extra containers (1 stopped + 1 running per network namespace)

services:

  pod:
    # Long running process to keep the network namespace alive
    # Stopping (or restarting) this container will kill the network in dependent containers
    command: sleep infinity
    image: alpine
    # Use a non-root user
    user: 1000:1000
    # Use init, as sleep isn't intended to run as pid 1
    # https://daveiscoding.com/why-do-you-need-an-init-process-inside-your-docker-container-pid-1
    init: true
    # Drop all capabilities and make read-only, this container does nothing
    cap_drop:
      - ALL
    read_only: true
    # Bonus: block DNS tunneling by disabling DNS forwarding (forward to 0.0.0.0)
    # Containers can still resolve hostnames of containers on the same network (e.g. hostname_demo),
    # but they can't resolve public domains such as google.com
    # However, this also affects /etc/resolv.conf inside the firewall container
    # https://github.com/moby/moby/issues/19474#issuecomment-276406305
    dns: 0.0.0.0

  firewall:
    # Use firewalld, or any other image with nftables installed
    image: quay.io/firewalld/firewalld
    # Run as root user
    user: 0:0
    # Give only the required NET_ADMIN capability
    cap_drop:
      - ALL
    cap_add: 
      - NET_ADMIN
    depends_on:
      pod:
        # Compose should restart the firewall after it updates pod
        # This applies to explicit restart controlled by a Compose operation only!
        restart: true
        condition: service_started
    # Join the network namespace of pod
    network_mode: service:pod
    command: >
      sh -c "
            echo 'Set up a new table and chain...' &&
            nft add table inet filter &&
            nft add chain inet filter output { type filter hook output priority 0\\; } &&
            
            echo 'Allow private network ranges...' &&
            nft add rule inet filter output ip daddr 10.0.0.0-10.255.255.255 accept &&
            nft add rule inet filter output ip daddr 172.16.0.0-172.31.255.255 accept &&
            nft add rule inet filter output ip daddr 192.168.0.0-192.168.255.255 accept &&
            nft add rule inet filter output ip daddr 127.0.0.0-127.255.255.255 accept &&
            
            echo 'Allow 1.1.1.1 as example...' &&
            nft add rule inet filter output ip daddr 1.1.1.1 accept &&
            
            echo 'Drop connections to all other IP addresses...' &&
            nft add rule inet filter output drop &&

            nft list table inet filter
          "

  restricted:
    image: alpine
    # Start this container once the firewall has been set up
    depends_on:
      firewall:
        restart: true
        condition: service_completed_successfully
    network_mode: service:pod
    # Run as regular user
    user: 1000:1000
    init: true
    # Without any capabilities
    cap_drop:
      - ALL
    # Show the firewall is effective
    command: >
      sh -c "
            echo 'Resolving internal hostname works:'
            ping hostname_demo -c 1
            echo
            echo 'Resolving public domain fails:'
            wget -T 1 -t 1 google.com -O - | head
            echo
            echo 'Accessing allowed IP address works:'
            wget 1.1.1.1 -O - | head -n 21 | tail -n 17
            echo
            echo 'Accessing any other IP address fails:'
            wget -T 1 -t 1 40.89.244.232 -O - | head
            exit 0
          "

  # Bonus: another container on the same network to demo resolving internal DNS names
  hostname_demo:
    image: alpine
    user: 1000:1000
    init: true
    cap_drop:
      - ALL
    read_only: true
    command: sleep infinity

  # Bonus: try wiping the firewall rules as root user
  try_firewall_wipe:
    image: quay.io/firewalld/firewalld
    depends_on:
      restricted:
        restart: true
        condition: service_completed_successfully
    network_mode: service:pod
    user: 0:0
    init: true
    cap_drop:
      - ALL
    # Root can't wipe the firewall rules without the NET_ADMIN capability
    command: >
      sh -c "
            echo 'Try wiping the firewall as root user...'
            nft flush ruleset
            echo 'Demo is finished!'
          "

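As a possible variation on the `firewall` service above, the chained `nft add` commands could be replaced by a single ruleset loaded with `nft -f`, which applies the whole file in one transaction instead of rule by rule. The sketch below is meant to be equivalent to the rules above: the address ranges are written in CIDR form and the final `drop` rule becomes the chain policy. `print_ruleset` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: the output-chain rules from the firewall service as one ruleset.
# print_ruleset is a hypothetical helper; load its output with `nft -f -`.

print_ruleset() {
    cat <<'EOF'
table inet filter {
    chain output {
        type filter hook output priority 0; policy drop;
        ip daddr { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8 } accept
        ip daddr 1.1.1.1 accept
    }
}
EOF
}

# Usage (inside the firewall container, which holds NET_ADMIN):
#   print_ruleset | nft -f - && nft list table inet filter
```

Because the ruleset is loaded as a unit, a syntax error should leave the namespace without a partially applied chain, and the `restricted` service would then not start, since `firewall` would not complete successfully.
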
capnspacehook added the "question" label on May 14, 2023

capnspacehook added the "enhancement" label and removed the "question" label on Jun 4, 2023

capnspacehook added the "help wanted" label on Jun 17, 2023
