From f407abea6eccd792fc4058d56160cce5fb6ccaa6 Mon Sep 17 00:00:00 2001 From: Louis Scalbert Date: Sat, 14 Sep 2024 08:39:05 +0200 Subject: [PATCH] bgpd: fix missing addpath withdrawal race condition There is a race condition with addpath withdrawal. The withdrawal never happens if it was request when coalesce timer running. It can be demonstrated by adding to bgp_snmp_bgp4v2mib/rr/bgpd.conf: > router bgp 65004 > + coalesce-time 10000 But it also happens in other conditions. For example, when using "vtysh -f" to load the configuration. It results in having two identical paths but with different addpath on r2: > r2# sh bgp ipv4 10.0.0.0/31 > BGP routing table entry for 10.0.0.0/31, version 3 > Paths: (3 available, best #1, table default) > 65004 > 192.168.12.4 from 192.168.12.4 (192.168.12.4) > Origin IGP, metric 1, valid, external, multipath, best (AS Path) > AddPath ID: RX 0, TX-All 2 TX-Best-Per-AS 0 TX-Best-Selected 0 > Advertised to: 192.168.12.4 > Last update: Thu Sep 12 16:13:59 2024 > 65004 > 192.168.12.4 from 192.168.12.4 (192.168.12.4) > Origin IGP, metric 1, valid, external, multipath > AddPath ID: RX 3, TX-All 4 TX-Best-Per-AS 0 TX-Best-Selected 0 > Advertised to: 192.168.12.4 > Last update: Thu Sep 12 16:13:59 2024 The first path has been created first but should be withdrawn later. Withdraw the stale addpath even the coalesce timer is running. Signed-off-by: Louis Scalbert --- bgpd/bgp_updgrp_adv.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/bgpd/bgp_updgrp_adv.c b/bgpd/bgp_updgrp_adv.c index 250378af6845..b48455f6ea95 100644 --- a/bgpd/bgp_updgrp_adv.c +++ b/bgpd/bgp_updgrp_adv.c @@ -228,6 +228,12 @@ static int group_announce_route_walkcb(struct update_group *updgrp, void *arg) afi2str(afi), safi2str(safi), ctx->dest); UPDGRP_FOREACH_SUBGRP (updgrp, subgrp) { + /* withdraw stale addpath without waiting for the coalesce timer timeout. + * Otherwise, since adj->addpath_tx_id is overwritten, the code never + * notice anymore it has to do a withdrawal. + */ + if (addpath_capable) + subgrp_withdraw_stale_addpath(ctx, subgrp); /* * Skip the subgroups that have coalesce timer running. We will * walk the entire prefix table for those subgroups when the @@ -237,8 +243,6 @@ static int group_announce_route_walkcb(struct update_group *updgrp, void *arg) /* An update-group that uses addpath */ if (addpath_capable) { - subgrp_withdraw_stale_addpath(ctx, subgrp); - subgrp_announce_addpath_best_selected(ctx->dest, subgrp);