-
Notifications
You must be signed in to change notification settings - Fork 789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nftables support for ipmasq and portmap #935
Conversation
d6e2a46
to
60eb4e3
Compare
sigh, this is what coreos/go-iptables#107 was supposed to fix but I guess I fixed it for iptables-legacy but not iptables-nft? (Was this not failing before? I'm not sure why this change would make the race condition suddenly start happening.) |
pkg/ip/ipmasq_linux.go
Outdated
@@ -23,7 +23,9 @@ import ( | |||
|
|||
// SetupIPMasq installs iptables rules to masquerade traffic | |||
// coming from ip of ipn and going outside of ipn | |||
func SetupIPMasq(ipn *net.IPNet, chain string, comment string) error { | |||
func SetupIPMasq(ipn *net.IPNet, pluginName, containerID string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a lot but seems this is used externally too https://grep.app/search?q=SetupIPMasq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, right. Do I have to add a new API rather than changing the old one? I don't see anything about API stability guarantees...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not familiar with this codebase , let's ask @squeed
c69144c
to
e48682c
Compare
linter fail, I see is still in draft, let me know if you want a review, |
OK, finally updated and tested:
(I guess this will need a docs update to the cni.dev repo?) |
/assign @squeed |
oh, and I could port this to use the same nftables library that spoofcheck uses, if you preferred. or port the spoofcheck code to use knftables like kube-proxy (and Calico and soon some other stuff) |
|
pkg/ip/ipmasq_nftables_linux.go
Outdated
|
||
// hashForContainer returns a unique hash identifying the rules for this container with | ||
// this plugin | ||
func hashForContainer(pluginName, containerID string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an error in the iptables implementation of this -- this needs to hash (network, ifname, containerid).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pluginName
here is NetConf.Name
, which is what you mean by network
, right? (Yeah, pluginName
is a bad variable name... I was assuming it meant NetConf.Type
...)
Why do we need ifname
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need ifname?
Because it is allowed to add and remove the same container to the same network multiple times. In that case, the ifname is the only distinct identifier.
pkg/ip/ipmasq_nftables_linux.go
Outdated
} | ||
|
||
func setupIPMasqNFTablesWithInterface(nft knftables.Interface, ipn *net.IPNet, pluginName, containerID string) error { | ||
comment := hashForContainer(pluginName, containerID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The iptables implementation is wrong here -- we need to be able to find all masquerade rules for a given network. Can the comment include network name in plain text?
Looks basically good. I'd like to see GC support if possible, since it informs the design the API. Could you add that, even if the top-end API glue isn't there yet. |
…sNFTables Signed-off-by: Dan Winship <[email protected]>
Signed-off-by: Dan Winship <[email protected]>
Signed-off-by: Dan Winship <[email protected]>
a833549
to
043253e
Compare
Use `conditionsV4` and `conditionsV6` values that actually look like valid iptables conditions. Signed-off-by: Dan Winship <[email protected]>
Signed-off-by: Dan Winship <[email protected]>
@squeed this is ready for re-review btw; it includes an implementation of GC for nftables ipmasq, which isn't actually plumbed through, but at least gets unit tested. |
This adds nftables backends to the two remaining iptables-only bits in containernetworking/plugins; ipmasq (used by
ptp
andbridge
) andportmap
. For now, they are both set up to use iptables by default, unless you explicitly request nftables via the network config, or you run the plugin on a host that has nft installed but not iptables (eg, future RHEL 10 hosts).The
firewall
plugin currently has "iptables" and "firewalld" backends. In theory it could have a separate explicit "nftables" backend, but I didn't do that here.The ipmasq and portmap code implement nftables via https://github.com/danwinship/nftables, which is a library I started for the planned kube-proxy nftables backend, and also plan to use for various other nftables ports (eg, cri-o). It might move from
danwinship/
to somewhere more generic at some point, but it might not.After I started working on this branch, I discovered that the
bridge
plugin was actually already using nftables viapkg/link/spoofcheck.go
, using the https://github.com/networkplumbing/go-nft library. Using two separate nftables libraries is probably not great, so we may want to fix that eventually. The libraries have somewhat similar APIs up to a point, then diverge completely becausego-nft
uses the JSON API to/sbin/nft
and therefore has to use the (poorly-documented) JSON rule schema, whiledanwinship/nftables
uses the plain text API to/sbin/nft
and so uses "normal" nft rules. eg, compare:which in
danwinship/nftables
syntax would become:("Draft" PR because not fully-tested yet...)