Support Babeld without VLAN on ethernet interfaces inside br-lan, replaces #600 #631

Open
wants to merge 4 commits into
base: master
1 change: 1 addition & 0 deletions packages/lime-docs/files/lime-example
@@ -51,6 +51,7 @@ config lime network
option bmx7_over_batman false
option bmx7_pref_gw none # Force bmx7 to use a specific gateway to Internet (hostname must be used as identifier)
option bmx7_wifi_rate_max 'auto'
option babeld_over_batman false # When Babeld is run without VLAN (babeld:0), it runs on the bridge which includes Batman-adv's bat0, keeping this false avoids to have Babeld seeing all the nodes as direct neighbors due to Batman-adv. Set it to true just if Babeld is active only on a few border nodes.
option anygw_mac 'aa:aa:aa:%N1:%N2:aa' # Parametrizable with %Nn. Keep in mind that the ebtables rule will use a mask of ff:ff:ff:00:00:00 so br-lan will not forward anything coming in that matches the first 3 bytes of it's own anygw_mac (aa:aa:aa: by default)
# option autoap_enabled 0 # Requires lime-ap-watchping installed. If enabled AP SSID is changed to ERROR when network issues
# option autoap_hosts "8.8.8.8 141.1.1.1" # Requires lime-ap-watchping installed. Hosts used to check if the network is working fine
2 changes: 1 addition & 1 deletion packages/lime-proto-babeld/Makefile
@@ -22,7 +22,7 @@ define Package/$(PKG_NAME)
CATEGORY:=LiMe
MAINTAINER:=Gioacchino Mazzurco <[email protected]>
URL:=https://libremesh.org
-DEPENDS:=+babeld +lime-system
+DEPENDS:=+babeld +lime-system +kmod-ebtables-ipv6
PKGARCH:=all
endef

75 changes: 68 additions & 7 deletions packages/lime-proto-babeld/files/usr/lib/lua/lime/proto/babeld.lua
@@ -89,6 +89,18 @@ function babeld.configure(args)

uci:save("libremap")

--! If Babeld's Hello packets run over Batman-adv (whose bat0 is also
--! included in br-lan), all the Babeld nodes would appear as being direct
--! neighbors, so these Hello packets on bat0 have to be filtered
local babeldOverBatman = config.get_bool("network", "babeld_over_batman")
if utils.is_installed("kmod-batman-adv") and not babeldOverBatman then
fs.mkdir("/etc/firewall.lime.d")
fs.writefile("/etc/firewall.lime.d/21-babeld-not-over-bat0-ebtables",
"ebtables -t nat -A POSTROUTING -o bat0 -p ipv6"..
" --ip6-proto udp --ip6-sport 6696 --ip6-dport 6696 -j DROP\n")
else
fs.remove("/etc/firewall.lime.d/21-babeld-not-over-bat0-ebtables")
end
end

function babeld.setup_interface(ifname, args)
@@ -99,25 +111,74 @@ function babeld.setup_interface(ifname, args)

utils.log("lime.proto.babeld.setup_interface(...)", ifname)

-local vlanId = args[2] or 17
+local vlanId = tonumber(args[2]) or 17
local vlanProto = args[3] or "8021ad"
local nameSuffix = args[4] or "_babeld"
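
The `tonumber` change matters because the protocol arguments arrive as strings (for example `babeld:0` splits into `"babeld"` and `"0"`), and in Lua a string never compares equal to a number. The following is a minimal sketch, not the project's code: it assumes a `:` separator and uses a hypothetical `split` helper in place of `utils.split`.

```lua
-- Hypothetical stand-in for utils.split; assumes a single-character separator.
local function split(str, sep)
    local parts = {}
    for piece in string.gmatch(str, "([^" .. sep .. "]+)") do
        parts[#parts + 1] = piece
    end
    return parts
end

local args = split("babeld:0", ":")

-- Old behaviour: vlanId stays the string "0", so the numeric test fails.
local oldVlanId = args[2] or 17
print(oldVlanId == 0)            -- false

-- New behaviour: vlanId is the number 0, so "no VLAN" is detected.
local newVlanId = tonumber(args[2]) or 17
print(newVlanId == 0)            -- true

-- With no VLAN argument at all, the default of 17 is used.
print(tonumber(split("babeld", ":")[2]) or 17)   -- 17
```

Without the conversion, the later `vlanId == 0` checks would never be true for a value read from the configuration.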


--! If Babeld is without VLAN (vlanId is 0) it should run directly on plain
--! ethernet interfaces, but the ones which are inside of the LAN bridge
--! (e.g. eth0 or eth0.1) cannot have an IPv6 Link-Local and Babeld needs it.
--! So Babeld has to run on the bridge interface br-lan
local isIntoLAN = false
local addIPtoIf = true
for _,v in pairs(args["deviceProtos"]) do
if v == "lan" then
isIntoLAN = true
--! would be weird to add a static IP to the WAN interface
elseif v == "wan" then
addIPtoIf = false
end
end

if ifname:match("^wlan") then
--! currently (2019-10-12) mode-ap and mode-apname have an hardcoded
--! "option network lan" so they are always in the br-lan bridge
if ifname:match("^wlan.*ap$") or ifname:match("^wlan.*apname$") then
Contributor

At the start of this function there is the following code:

	if not args["specific"] and ifname:match("^wlan%d+.ap") then
		utils.log("lime.proto.babeld.setup_interface(...)", ifname, "ignored")
		return
	end

As far as I understand, for ap and apname it won't reach this part of the code. Is that OK?

Member Author

Yes, by default the AP and APname interfaces are not used for the backbone and the routing.
To use an AP interface for routing with Babeld, APbb should be used, as implemented in #554 and documented in lime-example here.

Contributor

OK, maybe I am not understanding. Under which conditions will the program enter the if on line 137?

Member Author

You mean this if, right?

if ifname:match("^wlan.*ap$") or ifname:match("^wlan.*apname$") then

It gets hit when an AP or an APname is configured using interface-specific configuration (some documentation on the website here) and babeld is also selected in the interface-specific configuration.

An example of a case where an AP needs interface-specific configuration is in #262 (5 GHz radio used just for mesh and 2.4 GHz radio used for AP+APname+mesh, which cannot currently be done with the general configuration).

Something like the following (options not specified in the wifi sections are taken from the generic wifi configuration, see db1c350; I don't know whether more options should be specified for the net sections):

config wifi radio0 # assuming that radio0 is 2.4 GHz
	list modes 'ap'
	list modes 'apname'
	list modes 'ieee80211s'

config net wireless2ap
	option linux_name 'wlan0-ap'
	list protocols lan # this is hardcoded anyway in ap proto, adding also here just for clarity
	list protocols babeld:17

config net wireless2apname
	option linux_name 'wlan0-apname'
	list protocols lan # this is hardcoded anyway in apname proto, adding also here just for clarity
	list protocols babeld:17

config net wireless2mesh
	option linux_name 'wlan0-mesh'
	list protocols babeld:17

config wifi radio1 # assuming that radio1 is 5 GHz
	list modes 'ieee80211s'

config net wireless5mesh
	option linux_name 'wlan1-mesh'
	list protocols babeld:17

(beware, I did not test this example)
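
The interplay between the two pattern checks discussed in this thread can be tried out in isolation. This is a minimal sketch: the two patterns are copied from the code quoted above, while the helper names `ignoredByDefault` and `isApOrApname` are hypothetical and exist only for illustration.

```lua
-- Early-return check at the top of setup_interface, as quoted above.
local function ignoredByDefault(ifname, specific)
    return (not specific) and ifname:match("^wlan%d+.ap") ~= nil
end

-- Later check that marks AP and APname interfaces as part of br-lan.
local function isApOrApname(ifname)
    return ifname:match("^wlan.*ap$") ~= nil
        or ifname:match("^wlan.*apname$") ~= nil
end

for _, ifname in ipairs({ "wlan0-ap", "wlan0-apname", "wlan0-mesh" }) do
    print(ifname,
          "ignored without specific config:", ignoredByDefault(ifname, false),
          "ignored with specific config:", ignoredByDefault(ifname, true),
          "treated as inside br-lan:", isApOrApname(ifname))
end
```

Under these assumptions, `wlan0-ap` and `wlan0-apname` are skipped unless interface-specific configuration is present, which is the only case where the second check can be reached for them.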

isIntoLAN = true

--! all the WLAN interfaces are ignored by proto-lan
--! so they are not in the bridge even if proto-lan is present
--! (except mode-ap and mode-apname as mentioned above)
else
isIntoLAN = false
end
end

if vlanId == 0 and isIntoLAN then
utils.log("Rather than "..ifname..
", adding br-lan into Babeld interfaces")
ifname = "br-lan"
--! br-lan has already an IPv4, no need to add it
addIPtoIf = false
end

local owrtInterfaceName, linuxVlanIfName, owrtDeviceName =
network.createVlanIface(ifname, vlanId, nameSuffix, vlanProto)

local ipv4, _ = network.primary_address()

local uci = config.get_uci_cursor()

if(vlanId ~= 0 and ifname:match("^eth")) then
uci:set("network", owrtDeviceName, "mtu", tostring(network.MTU_ETH_WITH_VLAN))
end

-uci:set("network", owrtInterfaceName, "proto", "static")
-uci:set("network", owrtInterfaceName, "ipaddr", ipv4:host():string())
-uci:set("network", owrtInterfaceName, "netmask", "255.255.255.255")
-uci:save("network")
if addIPtoIf then
local ipv4, _ = network.primary_address()
--! the "else" way should always work but it fails in a weird way
--! with some wireless interfaces without VLAN
--! (e.g. works with wlan0-mesh and fails with wlan1-mesh)
--! so for these cases, the first way is used
--! (which indeed fails for most of the other cases)
if ifname:match("^wlan") and tonumber(vlanId) == 0 then
uci:set("network", owrtInterfaceName, "ifname", "@"..owrtDeviceName)
else
uci:set("network", owrtInterfaceName, "ifname", linuxVlanIfName)
end
Comment on lines +172 to +176
Member Author

@spiccinini @G10h4ck @nicopace @gmarcos87
Please review this, I think it is the part of this PR most likely to be suboptimal.

Member

That is not a new issue: we have workarounds for similar things in other places in the code. It seems that using linuxVlanIfName triggers a race condition inside netifd due to some unsolved upstream bug, and that causes unreliable behaviour. If it works well in all cases with this workaround (I assume you tested it a lot), I am OK with merging it!

Member Author

I did test it on 5 different routers (listed in the first comment in the PR), but more testing would also be nice to have :)
Can you point me to the other places in the code where such a workaround is used (so that I can compare solutions)?

Member Author

> The babeld_over_batman option should not be removed: it is what permits us to eventually set up and test Babel only on frontier nodes, as per the original LibreMesh scalability proposal.

@G10h4ck ah... Really? This scenario is the number 1 feature in LibreMesh and I did not understand how it should work until now. More documentation is needed (this is not explained in the How it works either). I will add that option back as soon as possible.

> Implementing the full LibreMesh proposal is not a simple task, so we started from the easier part, which is solvable more or less just by generating a static configuration when lime-config is run. The other parts of the proposal (for example, automatically turning on the L3 routing protocol only on frontier nodes) need more things going on at run time, and we are implementing those bit by bit (shared-state is one of them). In the meantime, even if we don't have all those pieces in place, it is important to keep the pieces we already have compatible with that scenario, so that when we get there it is not too overwhelming to make it work.

I just added back the babeld_over_batman option. I took the code for it from #600, so it should work (not tested now, but I tested it some time ago in the previous PR).
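
The decision made by the restored option can be sketched without any LibreMesh libraries. The helper name `babeldBat0Rule` is hypothetical; the rule text is copied from the diff above, and the two boolean arguments stand in for `utils.is_installed("kmod-batman-adv")` and the `babeld_over_batman` setting.

```lua
-- Hypothetical helper: returns the rule text to write into the firewall
-- snippet, or nil when the snippet should be removed instead.
local function babeldBat0Rule(batmanInstalled, babeldOverBatman)
    if batmanInstalled and not babeldOverBatman then
        -- Drop Babel packets (UDP port 6696 over IPv6) leaving through bat0.
        return "ebtables -t nat -A POSTROUTING -o bat0 -p ipv6" ..
               " --ip6-proto udp --ip6-sport 6696 --ip6-dport 6696 -j DROP\n"
    end
    return nil
end

print(babeldBat0Rule(true, false))   -- the DROP rule
print(babeldBat0Rule(true, true))    -- nil
print(babeldBat0Rule(false, false))  -- nil
```

Only the combination "Batman-adv installed and babeld_over_batman false" produces the filter; in every other case the file is removed, so Babeld packets cross bat0 unfiltered when the option is set to true.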

Member Author

@spiccinini could you test this on LibreRouter? Two scenarios should be tested: with VLAN and without VLAN.
To verify that it works: issue echo dump | nc ::1 30003 and check if all the included interfaces (apart from the WAN one) have an IPv4 in both VLAN and noVLAN configurations. In the noVLAN configuration the br-lan interface has to be present in the list of the Babeld interfaces.

Contributor
@spiccinini, Nov 14, 2019

Yes, I used the code from this PR (rebased to master). Maybe I made a mistake. I deleted both wireless and network from /etc/config, regenerated them with generate_config, then ran lime-config and lime-apply.

Member Author
@ilario, Nov 14, 2019

I am not familiar with config_generate and I never trusted lime-apply...
Maybe #75 is related?
@spiccinini could you try without deleting the config files, just running lime-config and rebooting? Just to be sure...

Contributor

If I don't delete the config the vlan 17 still exists after lime-config

Member Author

Wow... Even after the reboot? So something is failing badly.
Can you post the full /etc/config/network for both cases: with the VLAN set in /etc/config/lime and with the VLAN set to zero in /etc/config/lime?

Member Author

I did more testing and for me it works. Can you confirm whether it fails with LibreRouter or with your compilation procedure?
Regarding the tests, I am not able to write them by myself. Can you do it or guide me?

uci:set("network", owrtInterfaceName, "proto", "static")
uci:set("network", owrtInterfaceName, "ipaddr", ipv4:host():string())
uci:set("network", owrtInterfaceName, "netmask", "255.255.255.255")
uci:save("network")
end

uci:set("babeld", owrtInterfaceName, "interface")
uci:set("babeld", owrtInterfaceName, "ifname", linuxVlanIfName)

@@ -39,6 +39,7 @@ config lime network
option bmx7_over_batman false
option bmx7_pref_gw none
option bmx7_wifi_rate_max 'auto'
option babeld_over_batman false
option anygw_mac "aa:aa:aa:%N1:%N2:aa"
option use_odhcpd false

2 changes: 2 additions & 0 deletions packages/lime-system/files/usr/lib/lua/lime/network.lua
@@ -280,6 +280,8 @@ function network.configure()
flags["_specific_section"] = owrtIf
end

flags["deviceProtos"] = deviceProtos

for _,protoParams in pairs(deviceProtos) do
local args = utils.split(protoParams, network.protoParamsSeparator)
if args[1] == "manual" then break end -- If manual is specified do not configure interface
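
With `deviceProtos` passed down to each protocol, the interface choice made in `babeld.setup_interface` can be sketched as a pure function. This is an illustrative reduction of the logic in the diff above, not the code of the PR: the function name `chooseBabeldInterface` is hypothetical and the UCI and VLAN handling is left out.

```lua
-- Hypothetical stand-alone version of the checks in babeld.setup_interface.
-- Returns the interface Babeld should use and whether to add a static IPv4.
local function chooseBabeldInterface(ifname, vlanId, deviceProtos)
    local isIntoLAN = false
    local addIPtoIf = true
    for _, proto in pairs(deviceProtos) do
        if proto == "lan" then
            isIntoLAN = true
        elseif proto == "wan" then
            -- a static IP on the WAN interface is not wanted
            addIPtoIf = false
        end
    end

    -- WLAN interfaces are in br-lan only in AP and APname modes.
    if ifname:match("^wlan") then
        isIntoLAN = ifname:match("^wlan.*ap$") ~= nil
            or ifname:match("^wlan.*apname$") ~= nil
    end

    -- Without VLAN, interfaces inside the bridge are replaced by br-lan,
    -- which already has an IPv4 address.
    if vlanId == 0 and isIntoLAN then
        return "br-lan", false
    end
    return ifname, addIPtoIf
end

print(chooseBabeldInterface("eth0.1", 0, { "lan", "babeld:0" }))
print(chooseBabeldInterface("eth0.1", 17, { "lan", "babeld:17" }))
print(chooseBabeldInterface("wlan0-mesh", 0, { "lan", "babeld:0" }))
print(chooseBabeldInterface("eth1", 0, { "wan", "babeld:0" }))
```

The interface names and protocol lists in the calls are made-up inputs chosen to exercise each branch.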