zebra: clear NEXTHOP_FLAG_LINKDOWN when interface comes up #20397
+47
−2
Problem
When the kernel installs a route while the nexthop interface is down, it sends RTM_NEWROUTE with RTNH_F_LINKDOWN. Zebra copies this flag to NEXTHOP_FLAG_LINKDOWN. However, when the interface comes back up, the kernel does not send a netlink route update to clear this flag; it sends RTM_NEWLINK instead.
This causes kernel routes to remain marked as "linkdown" in zebra even after the nexthop interface is operational.
Root Cause
The flag NEXTHOP_FLAG_LINKDOWN was added via c704cb4. It is set from the kernel's route message, but nothing in FRR ever clears it.
Fix
Have zebra track the interface operational state (IFF_LOWER_UP) and update NEXTHOP_FLAG_LINKDOWN accordingly in nexthop_active_check() for kernel and system routes. Also add NEXTHOP_FLAG_LINKDOWN to the NHE hash comparison so that changes to this flag result in a new NHE being created, and track linkdown changes to trigger ROUTE_ENTRY_CHANGED for proper NHE updates.
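The core of the fix can be sketched in isolation. The following is a minimal, self-contained illustration and not the code in this PR: the `sketch_` types, function names, and bit values are hypothetical stand-ins for zebra's interface and nexthop structures, and the real logic lives in nexthop_active_check() and the NHE hash comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; the bit values and struct layouts are NOT FRR's. */
#define SKETCH_NEXTHOP_FLAG_LINKDOWN (1u << 0)
#define SKETCH_IFF_LOWER_UP          (1u << 1)

struct sketch_interface {
	uint32_t flags; /* would carry IFF_LOWER_UP, learned from RTM_NEWLINK */
};

struct sketch_nexthop {
	uint32_t flags; /* would carry NEXTHOP_FLAG_LINKDOWN */
};

/*
 * Make the nexthop's linkdown bit mirror the interface's operational state.
 * Returns true when the bit changed, so a caller could mark the route entry
 * as changed (ROUTE_ENTRY_CHANGED in the description above).
 */
static bool sketch_sync_linkdown(struct sketch_nexthop *nh,
				 const struct sketch_interface *ifp)
{
	assert(nh && ifp);

	bool was_down = (nh->flags & SKETCH_NEXTHOP_FLAG_LINKDOWN) != 0;
	bool is_down = (ifp->flags & SKETCH_IFF_LOWER_UP) == 0;

	if (is_down)
		nh->flags |= SKETCH_NEXTHOP_FLAG_LINKDOWN;
	else
		nh->flags &= ~SKETCH_NEXTHOP_FLAG_LINKDOWN;

	return was_down != is_down;
}

/*
 * Compare two nexthops the way a hash-equality callback might, including
 * the linkdown bit so that a flag change yields a distinct entry.
 */
static bool sketch_nexthop_same(const struct sketch_nexthop *a,
				const struct sketch_nexthop *b)
{
	assert(a && b);

	return (a->flags & SKETCH_NEXTHOP_FLAG_LINKDOWN) ==
	       (b->flags & SKETCH_NEXTHOP_FLAG_LINKDOWN);
}
```

In this sketch, a nexthop that was learned with the linkdown bit set reports a change the first time it is checked after the interface gains IFF_LOWER_UP, and it no longer compares equal to its stale copy, which is the property that forces a new NHE.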
Big Picture
The kernel sets the nexthop flag RTNH_F_LINKDOWN when the link goes down. If the sysctl ignore_routes_with_linkdown is set, nexthops carrying this flag are skipped during FIB lookup.
This is useful for cases like:
default via a.a.b.1 dev enp0s10 metric 20 onlink linkdown
default via x.x.x.49 dev wwx001e101f0000 metric 30
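To make the effect on these two default routes concrete, here is a deliberately simplified model of the selection. It is an assumption-laden sketch, not kernel code: `sketch_route` and `sketch_pick_route` are hypothetical names, and the real FIB lookup considers far more than metric and link state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_route {
	const char *dev;
	uint32_t metric;
	bool linkdown; /* models RTNH_F_LINKDOWN on the route's nexthop */
};

/*
 * Return the lowest-metric usable route, or NULL if none qualifies.
 * With ignore_linkdown set (modelling ignore_routes_with_linkdown=1),
 * routes whose nexthop is marked linkdown are skipped.
 */
static const struct sketch_route *
sketch_pick_route(const struct sketch_route *routes, size_t n,
		  bool ignore_linkdown)
{
	assert(routes || n == 0);

	const struct sketch_route *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (ignore_linkdown && routes[i].linkdown)
			continue;
		if (!best || routes[i].metric < best->metric)
			best = &routes[i];
	}
	return best;
}
```

Fed the two routes above, this model picks the metric 30 route via wwx001e101f0000 when linkdown routes are ignored, and the metric 20 route via enp0s10 otherwise.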
From FRR's point of view, this flag can be used to program hardware via dplane to stay in sync with kernel behavior.
Currently, however, the flag is purely cosmetic in FRR: it only affects show command output.
Repro
This issue is easy to reproduce:
Create a veth pair and bring up only veth0, so the link has no carrier
ip link add veth0 type veth peer name veth1
ip link set veth0 up
Add route while link is down
ip addr add 192.168.122.94/24 dev veth0
ip route add 192.168.100.0/24 via 192.168.122.1 dev veth0
Bring link up
ip link set veth1 up
Observe the stale linkdown flag:
K>* 192.168.100.0/24 [0/0] via 192.168.122.1, veth0 linkdown, weight 1, 00:00:54