-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
route: properly handle multiple kernel prefix routes with same dst #385
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -387,6 +387,39 @@ static uint32_t route_id_attrs_get(struct nl_object *obj) | |
if (route->rt_family == AF_MPLS) | ||
rv &= ~ROUTE_ATTR_PRIO; | ||
|
||
if (route->rt_protocol == RTPROT_KERNEL) { | ||
/* | ||
* If configuring Ip(v4) addresses for the same prefix on | ||
* different interfaces, the kernel will install a | ||
* prefix route for each interface with the ip address | ||
* as preferred source. | ||
*/ | ||
if (route->rt_family == AF_INET && | ||
route->rt_scope == RT_SCOPE_LINK) | ||
rv |= ROUTE_ATTR_PREF_SRC; | ||
|
||
/* | ||
* For IPv6 addresses, the prefix routes will have | ||
* a single dev nexthop. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think IPv6 multihop routes need a special treatment to make sense with a concept of "identity" for route objects. It's clear that the next-hop is part of the identity. If you add a route A with a certain next hop, and then add another route B that only differs by their nexthop, then kernels The idea that one RTM_DELROUTE modifies an existing route's nexthop list (thereby creating another route altogether) makes IMO no sense. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. at least the IPv6LL routes are treated as unique routes and not a single ECMP route:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah right, and for some reason the priorities are different on my system (likely because one if on a VPN tun device), which is why before even shows them as different. |
||
*/ | ||
if (route->rt_family == AF_INET6 && route->rt_nr_nh == 1) { | ||
struct rtnl_nexthop *first; | ||
|
||
first = nl_list_first_entry(&route->rt_nexthops, | ||
struct rtnl_nexthop, | ||
rtnh_list); | ||
|
||
/* | ||
* Only interface, no other values, so force | ||
* nexthop comparison. | ||
*/ | ||
if (!rtnl_route_nh_get_gateway(first) && | ||
!rtnl_route_nh_get_via(first) && | ||
!rtnl_route_nh_get_newdst(first)) | ||
rv |= ROUTE_ATTR_MULTIPATH; | ||
} | ||
} | ||
|
||
return rv; | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not special to
proto kernel
routes.I agree, that "what is the ID of a route" needs to be adjusted. I think it should just follow what NetworkManager does (here and here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's special in the case that only kernel can install "duplicate" routes for the same destination which have the same priority. Anybody else can only do that by setting different priorities (which is already part of the id attrs, so will be treated as unique routes).
At least in my limited testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you use
ip route append
or?ip route prepend
(EDIT:
prepend
doesn't exist)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't try append, that indeed did work:
Still presented to user space as multiple routes though:
Likewise for IPv6:
=>
(I have no idea what the v* mean so I'm just cargo culting there).
So I would think the (stop-gap?) solution is to do what I currently do for IPv6 routes also for IPv4 routes and then always, regardless of source/protocol?
Also I'm trying to fix the case where the same prefix route is present on different interfaces, not multiple nexthops on a single interface. Where the kernel sends each prefix route as its own
RTM_NEWROUTE
, not as a single route with multiple nexthops (or a single nexthop group).Maybe I missed a explanatory comment there also, adding
ROUTE_ATTR_MULTIPATH
to the ID attributes is done to trigger a nexthop comparison byroute_compare()
, without it ignores the difference in the (single) nexthop and treats the routes as identical.So this is only as a "please also check that the nexthop (= inferface) is identical". Not meant that these routes are actually MULTIPATH routes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this special treatment for IPv6 ECMP/multhop does not apply to IPv4 routes. In IPv4, a route can have 1 or more next-hops, and those next-hops are just part of the identity of the route. If you add a second IPv4 route, that differs from the first only by the list of nexthops, then simply a new route is added (and you have two similar IPv4 routes with different nexthop lists). Quite easy. On RTM_NEWROUTE message corresponds to one route object.
here you added onlink-routes. I guess, this merging of next-hops does not happen then (as you see). Compare my example, where the nexthops are gateways, then the merging happens.
What maybe would be the right way to model the IPv6 behaviors, is that IPv6 route objects are always only singlehop. Except, that one RTM_NEWROUTE message then can notify about multiple(!) single-hop routes (instead of one multi-hop routes). I did that in NetworkManager. Anyway, that is not what libnl currently does. It would be quite a change in behavior. What it does instead, it tries to search the cache for "similar" routes, and mangle them according to nexthop updates. That seems a questionable approach. It also seems hard to do efficiently (we cannot do a dictionary lookup, if the object's identity contains the next-hop list, but we don't know the object we search for).
I think the first step is to extend the set of ID attributes. I would just follow what NetworkManager does. We can also add one ID attribute at a time, if you don't want to bother to investigate them all at once. What I think was wrong, is to tie that to
if (route->rt_protocol == RTPROT_KERNEL)
.Yes. Because they are on-link, right?
that sounds right. Especially for IPv4.
And for IPv6, I think the proper solution would be that a route object always only has one next hop. In that case, honoring ROUTE_ATTR_MULTIPATH would also be correct (albeit there is nothing "multi"). I think in step 1, we can ignore that problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess there will be three types of routes:
And via
ip route append
you can have multiple routes out of these. E.g. I managed to setup:where I maybe even found some weird kernel behavior With this setup, if I do
So the kernel says no, but it didn't do nothing:
It added the route regardless, and deleted the one referencing nh_id 20 and the "single path" route with direct via.
Feels like it shouldn't modify the configuration and return an error. Will need to investigate if this behavior is intended/expected.