@mike-dubrovsky mike-dubrovsky commented Jan 7, 2026

Problem

When the kernel installs a route while the nexthop interface is down, it sends RTM_NEWROUTE with RTNH_F_LINKDOWN set. Zebra copies this flag to NEXTHOP_FLAG_LINKDOWN. However, when the interface comes back up, the kernel does not send a netlink route update to clear this flag; it sends RTM_NEWLINK instead.

This causes kernel routes to remain marked as "linkdown" in zebra even after the nexthop interface is operational.
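
To make the failure mode concrete, here is a minimal model of the behavior described above. It is a sketch, not zebra code: every `sk_`/`SK_` name is hypothetical, and the two handlers only imitate the roles of RTM_NEWROUTE and RTM_NEWLINK processing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real kernel or zebra definitions. */
#define SK_RTNH_F_LINKDOWN  (1u << 4) /* models the kernel's RTNH_F_LINKDOWN */
#define SK_NH_FLAG_LINKDOWN (1u << 0) /* models zebra's NEXTHOP_FLAG_LINKDOWN */

struct sk_interface {
    int ifindex;
    bool lower_up; /* models IFF_LOWER_UP */
};

struct sk_nexthop {
    int ifindex;
    uint32_t flags;
};

/* Models RTM_NEWROUTE handling: the kernel's flag is copied once. */
void sk_on_newroute(struct sk_nexthop *nh, int ifindex, uint32_t rtnh_flags)
{
    assert(nh != NULL);
    nh->ifindex = ifindex;
    nh->flags = 0;
    if (rtnh_flags & SK_RTNH_F_LINKDOWN)
        nh->flags |= SK_NH_FLAG_LINKDOWN;
}

/* Models RTM_NEWLINK handling before the fix: only the interface
 * state is recorded; no nexthop flag is revisited. */
void sk_on_newlink(struct sk_interface *ifp, bool lower_up)
{
    assert(ifp != NULL);
    ifp->lower_up = lower_up;
}
```

Calling `sk_on_newroute()` with the linkdown bit set and then `sk_on_newlink()` with `lower_up = true` leaves the interface marked up while the nexthop still carries the linkdown flag, which is the stale state this PR addresses.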

Root Cause

The flag NEXTHOP_FLAG_LINKDOWN was added via c704cb4. It is passed from the kernel but is never cleared in FRR.

Fix

Have zebra track the interface operational state (IFF_LOWER_UP) and update NEXTHOP_FLAG_LINKDOWN accordingly in nexthop_active_check() for kernel and system routes. Also add NEXTHOP_FLAG_LINKDOWN to the NHE hash comparison so that changes to this flag result in a new NHE being created, and track linkdown changes to trigger ROUTE_ENTRY_CHANGED for proper NHE updates.
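
The idea can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the `sk_` types and functions are hypothetical, the hash is a toy, and the real logic lives in nexthop_active_check() and the NHE hash code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_FLAG_LINKDOWN (1u << 0) /* models NEXTHOP_FLAG_LINKDOWN */

struct sk_ifstate {
    int ifindex;
    bool lower_up; /* models IFF_LOWER_UP */
};

struct sk_nh {
    int ifindex;
    uint32_t flags;
};

/* Re-derive the linkdown flag from the interface's operational state.
 * Returns true when the flag changed, so the caller can mark the
 * route entry as changed. */
bool sk_sync_linkdown(struct sk_nh *nh, const struct sk_ifstate *ifp)
{
    assert(nh != NULL && ifp != NULL);
    uint32_t old = nh->flags;

    if (ifp->lower_up)
        nh->flags &= ~SK_FLAG_LINKDOWN;
    else
        nh->flags |= SK_FLAG_LINKDOWN;

    return nh->flags != old;
}

/* Toy hash key that includes the linkdown bit, so a nexthop that
 * differs only in that bit maps to a different key. */
uint32_t sk_nh_hash(const struct sk_nh *nh)
{
    assert(nh != NULL);
    uint32_t h = (uint32_t)nh->ifindex * 2654435761u;

    if (nh->flags & SK_FLAG_LINKDOWN)
        h ^= 0x9e3779b9u;
    return h;
}
```

In this model, a `true` return from `sk_sync_linkdown()` plays the role of setting ROUTE_ENTRY_CHANGED, and the changed hash key plays the role of a new NHE being created.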

Big Picture

The kernel sets the nexthop flag RTNH_F_LINKDOWN when the link goes down, and skips that nexthop during FIB lookup if the sysctl ignore_routes_with_linkdown is set.

This is useful for cases like:

default via a.a.b.1 dev enp0s10 metric 20 onlink linkdown
default via x.x.x.49 dev wwx001e101f0000 metric 30
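
A simplified model of why this matters for the two default routes above: with the sysctl enabled, the lower-metric route is passed over while its link is down. The `sk_` names are hypothetical and the selection rule is deliberately reduced to "lowest metric among usable routes"; the kernel's actual FIB lookup is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_route {
    const char *dev;
    unsigned metric;
    bool linkdown; /* models RTNH_F_LINKDOWN on the route's nexthop */
};

/* Return the index of the lowest-metric route, skipping linkdown
 * routes when ignore_linkdown is true. Returns -1 if none is usable. */
int sk_select_route(const struct sk_route *routes, size_t n,
                    bool ignore_linkdown)
{
    assert(routes != NULL || n == 0);
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (ignore_linkdown && routes[i].linkdown)
            continue;
        if (best < 0 || routes[i].metric < routes[(size_t)best].metric)
            best = (int)i;
    }
    return best;
}
```

With the metric 20 route flagged linkdown and the metric 30 route up, `sk_select_route()` picks the metric 30 route when `ignore_linkdown` is true and the metric 20 route otherwise.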

From FRR's point of view, this flag can be used to program hardware via dplane to stay in sync with kernel behavior.

Currently, however, it is just cosmetic (show command only).

Repro

This issue is easy to reproduce:

1. Create a p2p link in a down state:
    ip link add veth0 type veth peer name veth1
    ip link set veth0 up
2. Add route while link is down:
    ip addr add 192.168.122.94/24 dev veth0
    ip route add 192.168.100.0/24 via 192.168.122.1 dev veth0
3. Bring link up:
    ip link set veth1 up
4. Observe the stale linkdown flag:
    K>* 192.168.100.0/24 [0/0] via 192.168.122.1, veth0 linkdown, weight 1, 00:00:54

When the kernel installs a route while the nexthop interface is down,
it sets RTNH_F_LINKDOWN on the route. Zebra copies this flag to
NEXTHOP_FLAG_LINKDOWN. However, when the interface comes back up,
the kernel does not send a route update to clear this flag.

This causes kernel routes to remain marked as "linkdown" in zebra
even after the nexthop interface is operational.

Fix this by having zebra track the interface operational state
(IFF_LOWER_UP) and update NEXTHOP_FLAG_LINKDOWN accordingly in
nexthop_active_check() for kernel and system routes. Also add
NEXTHOP_FLAG_LINKDOWN to the NHE hash comparison so that changes
to this flag result in a new NHE being created, and track linkdown
changes to trigger ROUTE_ENTRY_CHANGED for proper NHE updates.

Signed-off-by: Mike Dubrovsky <mdubrovs@cisco.com>

@mjstapp mjstapp left a comment

It sounds as if this should be handled in interface-change processing, not by adding some Linux-specific code to nexthop_active_check()?

@donaldsharp

I agree with Mark; this is state that can come from any dplane. In any event, I would like to see a topotest that shows that this behavior is now working.
