Merge pull request moby#48724 from robmry/no_dependency_on_filter_forward_policy

Remove dependency on the filter-FORWARD policy
robmry authored Oct 23, 2024
2 parents 34898da + aba8df7 commit 87365d9
Showing 14 changed files with 348 additions and 6 deletions.
10 changes: 9 additions & 1 deletion integration/network/bridge/iptablesdoc/generated/new-daemon.md
@@ -22,6 +22,7 @@ Table `filter`:

Chain DOCKER (1 references)
num pkts bytes target prot opt in out source destination
1 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
@@ -54,6 +55,7 @@ Table `filter`:
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
@@ -108,7 +110,12 @@ But, when ICC is disabled, rule 6 is DROP, so it would need to be placed before
rule 5. Because the rules are generated in different places, that's a slightly
bigger change than it should be._

The DOCKER chain is empty, because there are no containers with port mappings yet.
The DOCKER chain has a single DROP rule for the bridge network, to drop any
packets routed to the network that have not originated in the network. Added by
[defaultDrop][21].
_This means there is no dependency on the filter-FORWARD chain's default policy.
Even if it is ACCEPT, packets will be dropped unless container ports/protocols
are published._
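
As a rough illustration of why the FORWARD policy no longer matters, the sketch below models the single default-DROP rule in Go. The `packet` type and `dockerChainVerdict` function are hypothetical names invented for this example, not part of moby; the model only mimics the interface matching of `! -i docker0 -o docker0 -j DROP` and ignores everything else iptables does.

```go
package main

import "fmt"

// packet is a hypothetical, simplified view of a forwarded packet:
// only the interface it arrived on and the interface it will leave by.
type packet struct {
	in, out string
}

// dockerChainVerdict models a DOCKER chain that holds only the
// default-DROP rule for one bridge ("! -i <bridge> -o <bridge> -j DROP").
// It returns "DROP" when the rule matches, otherwise "RETURN", meaning
// processing continues in the calling FORWARD chain.
func dockerChainVerdict(bridge string, p packet) string {
	if p.in != bridge && p.out == bridge {
		return "DROP"
	}
	return "RETURN"
}

func main() {
	// Routed to docker0 from outside the network: dropped in the DOCKER
	// chain, so the FORWARD policy is never consulted for this packet.
	fmt.Println(dockerChainVerdict("docker0", packet{in: "eth0", out: "docker0"}))
	// Traffic between containers on docker0 does not match the rule.
	fmt.Println(dockerChainVerdict("docker0", packet{in: "docker0", out: "docker0"}))
}
```

Because the first packet is dropped before the end of the FORWARD chain is reached, the outcome is the same whether the policy is ACCEPT or DROP.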

The DOCKER-ISOLATION chains implement inter-network isolation; all (unrelated)
packets are processed by these chains. The rules are inserted at the head of the
@@ -119,6 +126,7 @@ chain when a network is created, in [setINC][20].
packets that are destined for any other network are dropped.

[20]: https://github.com/moby/moby/blob/333cfa640239153477bf635a8131734d0e9d099d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L369
[21]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L252

Table nat:

@@ -28,6 +28,7 @@ The filter table is updated as follows:

Chain DOCKER (1 references)
num pkts bytes target prot opt in out source destination
1 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
@@ -63,6 +64,7 @@ The filter table is updated as follows:
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -s 192.0.2.0/24 -o bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d 192.0.2.0/24 -i bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
@@ -32,6 +32,8 @@ The filter table is:
Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 ACCEPT 6 -- !bridge1 bridge1 0.0.0.0/0 192.0.2.2 tcp dpt:80
2 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
3 0 0 DROP 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
@@ -71,6 +73,8 @@ The filter table is:
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -i bridge1 -o bridge1 -j DROP
-A DOCKER -d 192.0.2.2/32 ! -i bridge1 -o bridge1 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER ! -i bridge1 -o bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i bridge1 ! -o bridge1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
@@ -35,6 +35,8 @@ The filter table is the same as with the userland proxy enabled.
Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 ACCEPT 6 -- !bridge1 bridge1 0.0.0.0/0 192.0.2.2 tcp dpt:80
2 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
3 0 0 DROP 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
@@ -71,6 +73,8 @@ The filter table is the same as with the userland proxy enabled.
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.0.2.2/32 ! -i bridge1 -o bridge1 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER ! -i bridge1 -o bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i bridge1 ! -o bridge1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
@@ -8,7 +8,7 @@ Running the daemon with the userland proxy disabled then, as before, adding a ne
--subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
docker run --network bridge1 -p 8080:80 --name c1 busybox

The filter table is the same as with the userland proxy enabled.
The filter table is largely the same as with the userland proxy enabled.

_Note that this means inter-network communication is disabled as-normal so,
although published ports will be directly accessible from a remote host
@@ -40,6 +40,9 @@ on the same host._
Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 ACCEPT 6 -- !bridge1 bridge1 0.0.0.0/0 192.0.2.2 tcp dpt:80
2 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
3 0 0 ACCEPT 1 -- * bridge1 0.0.0.0/0 0.0.0.0/0
4 0 0 DROP 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
@@ -76,6 +79,9 @@ on the same host._
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.0.2.2/32 ! -i bridge1 -o bridge1 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER -o bridge1 -p icmp -j ACCEPT
-A DOCKER ! -i bridge1 -o bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i bridge1 ! -o bridge1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
@@ -87,6 +93,39 @@ on the same host._

</details>

However, a rule is added by [setICMP][5] to the DOCKER chain (shown below) to
allow ICMP. The rule has no ICMP type match, so *ALL* ICMP message types are
allowed. The equivalent IPv6 rule uses `-p icmpv6` rather than `-p icmp`.

_The ACCEPT rule as shown by `iptables -L` looks alarming until you spot that it's
for `prot 1`._

Because the ICMP rule (rule 3) is per-network, it is appended to the chain along
with the default-DROP rule (rule 4). So, it is likely to be separated from
per-port/protocol ACCEPT rules for published ports on the same network. But it
will always appear before the default-DROP.

_[RFC 4890 section 4.3][6] makes recommendations for filtering ICMPv6. These
have been considered, but the host firewall is not a network boundary in the
sense used by the RFC. So, Node Information and Router Renumbering messages are
not discarded, and experimental/unused types are allowed because they may be
needed._

Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 ACCEPT 6 -- !bridge1 bridge1 0.0.0.0/0 192.0.2.2 tcp dpt:80
2 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
3 0 0 ACCEPT 1 -- * bridge1 0.0.0.0/0 0.0.0.0/0
4 0 0 DROP 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0


-N DOCKER
-A DOCKER -d 192.0.2.2/32 ! -i bridge1 -o bridge1 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER -o bridge1 -p icmp -j ACCEPT
-A DOCKER ! -i bridge1 -o bridge1 -j DROP
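
To make the first-match behaviour of the four rules listed above concrete, here is a minimal Go sketch that evaluates a few packets against a simplified copy of the chain. The `packet` and `rule` types and the `evaluate` function are assumptions made for this illustration; real iptables matching covers far more than the fields modelled here.

```go
package main

import "fmt"

// packet is a hypothetical, simplified forwarded packet.
type packet struct {
	proto   string // "tcp", "icmp", ...
	in, out string // input and output interfaces
	dst     string // destination address
	dport   int    // destination port (0 when not applicable)
}

// rule models only the matches used in the DOCKER chain shown above.
type rule struct {
	target string // "ACCEPT" or "DROP"
	proto  string // "" matches any protocol
	notIn  string // "" matches any input interface, else "! -i notIn"
	out    string // output interface, always set in this chain
	dst    string // "" matches any destination
	dport  int    // 0 matches any port
}

func (r rule) matches(p packet) bool {
	return (r.proto == "" || r.proto == p.proto) &&
		(r.notIn == "" || r.notIn != p.in) &&
		r.out == p.out &&
		(r.dst == "" || r.dst == p.dst) &&
		(r.dport == 0 || r.dport == p.dport)
}

// evaluate returns the target of the first matching rule, or "RETURN"
// when nothing matches and processing goes back to the calling chain.
func evaluate(chain []rule, p packet) string {
	for _, r := range chain {
		if r.matches(p) {
			return r.target
		}
	}
	return "RETURN"
}

// dockerChain mirrors rules 1 to 4 of the listing, in order.
var dockerChain = []rule{
	{target: "ACCEPT", proto: "tcp", notIn: "bridge1", out: "bridge1", dst: "192.0.2.2", dport: 80},
	{target: "DROP", notIn: "docker0", out: "docker0"},
	{target: "ACCEPT", proto: "icmp", out: "bridge1"},
	{target: "DROP", notIn: "bridge1", out: "bridge1"},
}

func main() {
	ping := packet{proto: "icmp", in: "eth0", out: "bridge1", dst: "192.0.2.2"}
	ssh := packet{proto: "tcp", in: "eth0", out: "bridge1", dst: "192.0.2.2", dport: 22}
	web := packet{proto: "tcp", in: "eth0", out: "bridge1", dst: "192.0.2.2", dport: 80}
	fmt.Println(evaluate(dockerChain, ping), evaluate(dockerChain, ssh), evaluate(dockerChain, web))
}
```

In this model the ICMP packet is accepted by rule 3 before it can reach the default-DROP in rule 4, while TCP to an unpublished port falls through to rule 4.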


The nat table is:

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
@@ -141,3 +180,5 @@ _And, the userland proxy won't be started for mapped ports._
[2]: https://github.com/moby/moby/blob/333cfa640239153477bf635a8131734d0e9d099d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L294
[3]: https://github.com/moby/moby/blob/675c2ac2db93e38bb9c5a6615d4155a969535fd9/libnetwork/drivers/bridge/port_mapping_linux.go#L477-L479
[4]: https://github.com/moby/moby/blob/333cfa640239153477bf635a8131734d0e9d099d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L290
[5]: https://github.com/robmry/moby/blob/d456d79cfc12cd7c801eebce0550b645c5343ca6/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L390-L395
[6]: https://www.rfc-editor.org/rfc/rfc4890#section-4.3
@@ -31,6 +31,8 @@ The filter table is updated as follows:
Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 ACCEPT 6 -- !bridge1 bridge1 0.0.0.0/0 192.0.2.2 tcp dpt:80
2 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
3 0 0 DROP 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
@@ -70,6 +72,8 @@ The filter table is updated as follows:
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.0.2.2/32 ! -i bridge1 -o bridge1 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER ! -i bridge1 -o bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i bridge1 ! -o bridge1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
@@ -93,8 +97,14 @@ Note that:
to the container's address. This rule is added when the container is created
(unlike all the other rules so-far, which were created during driver or
network initialisation). [setPerPortForwarding][1]
- These per-port rules are inserted at the head of the chain, so that they
appear before the network's DROP rule [defaultDrop][2] which is always
appended to the end of the chain. In this case, because `docker0` was
created before `bridge1`, the `bridge1` rules appear above and below the
`docker0` DROP rule.
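
The ordering described in the last bullet can be reproduced with a small Go sketch. The `chain` type and its `insertHead` and `appendTail` methods are hypothetical stand-ins for `iptables -I` and `iptables -A`, and the rule strings are shorthand labels rather than real rule specifications.

```go
package main

import "fmt"

// chain is a hypothetical in-memory stand-in for an iptables chain.
type chain []string

// insertHead mimics "iptables -I <chain>": the new rule becomes rule 1.
func (c chain) insertHead(rule string) chain {
	return append(chain{rule}, c...)
}

// appendTail mimics "iptables -A <chain>": the new rule goes last.
func (c chain) appendTail(rule string) chain {
	return append(c, rule)
}

// buildDockerChain replays the events in the order described above.
func buildDockerChain() chain {
	var c chain
	c = c.appendTail("DROP !docker0 -> docker0")       // docker0 is created first
	c = c.appendTail("DROP !bridge1 -> bridge1")       // bridge1 is created second
	c = c.insertHead("ACCEPT tcp dpt:80 !bridge1 -> bridge1") // container c1 publishes port 80
	return c
}

func main() {
	for i, r := range buildDockerChain() {
		fmt.Printf("%d %s\n", i+1, r)
	}
}
```

The result has the `bridge1` ACCEPT rule first, then the `docker0` DROP, then the `bridge1` DROP, which matches the rule order in the chain listing for this scenario.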

[1]: https://github.com/moby/moby/blob/675c2ac2db93e38bb9c5a6615d4155a969535fd9/libnetwork/drivers/bridge/port_mapping_linux.go#L795
[2]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L252

And the corresponding nat table:

@@ -142,6 +142,7 @@ type iptCmdType = string
const (
iptCmdLFilter4 iptCmdType = "LFilter4"
iptCmdSFilter4 iptCmdType = "SFilter4"
iptCmdLFilterDocker4 iptCmdType = "LFilterDocker4"
iptCmdSFilterForward4 iptCmdType = "SFilterForward4"
iptCmdSFilterDocker4 iptCmdType = "SFilterDocker4"
iptCmdLNat4 iptCmdType = "LNat4"
@@ -152,6 +153,7 @@ var iptCmds = map[iptCmdType][]string{
iptCmdLFilter4: {"iptables", "-nvL", "--line-numbers", "-t", "filter"},
iptCmdSFilter4: {"iptables", "-S", "-t", "filter"},
iptCmdSFilterForward4: {"iptables", "-S", "FORWARD"},
iptCmdLFilterDocker4: {"iptables", "-nvL", "DOCKER", "--line-numbers", "-t", "filter"},
iptCmdSFilterDocker4: {"iptables", "-S", "DOCKER"},
iptCmdLNat4: {"iptables", "-nvL", "--line-numbers", "-t", "nat"},
iptCmdSNat4: {"iptables", "-S", "-t", "nat"},
@@ -59,7 +59,12 @@ But, when ICC is disabled, rule 6 is DROP, so it would need to be placed before
rule 5. Because the rules are generated in different places, that's a slightly
bigger change than it should be._

The DOCKER chain is empty, because there are no containers with port mappings yet.
The DOCKER chain has a single DROP rule for the bridge network, to drop any
packets routed to the network that have not originated in the network. Added by
[defaultDrop][21].
_This means there is no dependency on the filter-FORWARD chain's default policy.
Even if it is ACCEPT, packets will be dropped unless container ports/protocols
are published._

The DOCKER-ISOLATION chains implement inter-network isolation; all (unrelated)
packets are processed by these chains. The rules are inserted at the head of the
@@ -70,6 +75,7 @@ chain when a network is created, in [setINC][20].
packets that are destined for any other network are dropped.

[20]: https://github.com/moby/moby/blob/333cfa640239153477bf635a8131734d0e9d099d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L369
[21]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L252

Table nat:

@@ -8,7 +8,7 @@ Running the daemon with the userland proxy disabled then, as before, adding a ne
--subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
docker run --network bridge1 -p 8080:80 --name c1 busybox

The filter table is the same as with the userland proxy enabled.
The filter table is largely the same as with the userland proxy enabled.

_Note that this means inter-network communication is disabled as-normal so,
although published ports will be directly accessible from a remote host
@@ -24,6 +24,28 @@ on the same host._

</details>

However, a rule is added by [setICMP][5] to the DOCKER chain (shown below) to
allow ICMP. The rule has no ICMP type match, so *ALL* ICMP message types are
allowed. The equivalent IPv6 rule uses `-p icmpv6` rather than `-p icmp`.

_The ACCEPT rule as shown by `iptables -L` looks alarming until you spot that it's
for `prot 1`._

Because the ICMP rule (rule 3) is per-network, it is appended to the chain along
with the default-DROP rule (rule 4). So, it is likely to be separated from
per-port/protocol ACCEPT rules for published ports on the same network. But it
will always appear before the default-DROP.

_[RFC 4890 section 4.3][6] makes recommendations for filtering ICMPv6. These
have been considered, but the host firewall is not a network boundary in the
sense used by the RFC. So, Node Information and Router Renumbering messages are
not discarded, and experimental/unused types are allowed because they may be
needed._

{{index . "LFilterDocker4"}}

{{index . "SFilterDocker4"}}
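
For readers unfamiliar with the `{{index . "..."}}` placeholders, the following Go sketch shows how such a template might be filled in. It assumes the template is rendered with the standard `text/template` package against a map from command name to captured output; the `render` helper and the sample output string are invented for illustration and are not taken from the moby test code.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// render is a hypothetical helper: it parses tmpl and executes it with
// outputs as the template's "." value, so {{index . "key"}} looks up
// outputs["key"].
func render(tmpl string, outputs map[string]string) (string, error) {
	t, err := template.New("doc").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, outputs); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Sample data standing in for captured "iptables -S DOCKER" output.
	outputs := map[string]string{
		"SFilterDocker4": "-N DOCKER\n-A DOCKER ! -i docker0 -o docker0 -j DROP",
	}
	out, err := render("{{index . \"SFilterDocker4\"}}\n", outputs)
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

With this arrangement, adding a new placeholder such as `LFilterDocker4` only needs a matching key in the map that is passed to the template.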

The nat table is:

{{index . "LNat4"}}
@@ -51,3 +73,5 @@ _And, the userland proxy won't be started for mapped ports._
[2]: https://github.com/moby/moby/blob/333cfa640239153477bf635a8131734d0e9d099d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L294
[3]: https://github.com/moby/moby/blob/675c2ac2db93e38bb9c5a6615d4155a969535fd9/libnetwork/drivers/bridge/port_mapping_linux.go#L477-L479
[4]: https://github.com/moby/moby/blob/333cfa640239153477bf635a8131734d0e9d099d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L290
[5]: https://github.com/robmry/moby/blob/d456d79cfc12cd7c801eebce0550b645c5343ca6/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L390-L395
[6]: https://www.rfc-editor.org/rfc/rfc4890#section-4.3
@@ -30,8 +30,14 @@ Note that:
to the container's address. This rule is added when the container is created
(unlike all the other rules so-far, which were created during driver or
network initialisation). [setPerPortForwarding][1]
- These per-port rules are inserted at the head of the chain, so that they
appear before the network's DROP rule [defaultDrop][2] which is always
appended to the end of the chain. In this case, because `docker0` was
created before `bridge1`, the `bridge1` rules appear above and below the
`docker0` DROP rule.

[1]: https://github.com/moby/moby/blob/675c2ac2db93e38bb9c5a6615d4155a969535fd9/libnetwork/drivers/bridge/port_mapping_linux.go#L795
[2]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L252

And the corresponding nat table:
