Prefix active in BGP, inactive in ZEBRA #17933

Open
2 tasks done
mksheera3 opened this issue Jan 27, 2025 · 2 comments
Labels
triage Needs further investigation

Comments

@mksheera3

Description

Scenario:

BGP receives a prefix, say P1, which resolves recursively via a nexthop.
This nexthop, call it NHR, is installed directly into ZEBRA by our own code.

In a scaled scenario, the following sequence is observed:

  1. NHR route present (resolved/valid)
  2. NHR is deleted (withdrawn from ZEBRA)
  3. NHR is advertised to ZEBRA
  4. P1 is advertised.

Result: P1 resolved/valid in BGP, but inactive in ZEBRA.

ANALYSIS:
Behaviour for each step:
Step 2 (NHR withdrawn): after early processing, the route node is added to the "META QUEUE" and waits for main processing.
Step 3 (NHR re-advertised): after early processing, ZEBRA tries to add the route node to the "META QUEUE" again, but this fails because the "rn" is already queued.
While NHT is still waiting for main processing, step 4 kicks in.
Step 4 (P1 advertised): BGP receives P1, processes it, and marks it valid (as the NHR route was already available). BGP sends the update to ZEBRA.
ZEBRA receives and processes it, determines that NHR is inactive (REMOVED flag is set), and marks P1 inactive.

After some time, the META QUEUE processes the queued NHR entry and finally marks NHR active. However, this information is not passed back to clients (BGP), as there is no change in the nexthop parameters. ZEBRA does, however, send a ROUTE_UPDATE to the kernel.
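
To make the ordering easier to follow, here is a toy Python model of the sequence as described in this report. It is only a sketch and not FRR code: `ToyRib`, its sets, and its method names are hypothetical, and the rule "notify clients only when the active state changed" is a simplifying assumption standing in for "no change in nexthop params". Step numbers in the comments refer to the numbered sequence in the scenario.

```python
from collections import deque


class ToyRib:
    """Toy model of the reported sequence. Hypothetical, not FRR code."""

    def __init__(self):
        self.queue = deque()      # prefixes waiting for deferred processing
        self.queued = set()       # prefixes already on the queue
        self.removed = set()      # prefixes flagged REMOVED, not yet processed
        self.pending_add = set()  # prefixes with an add not yet processed
        self.active = set()       # prefixes currently treated as active
        self.notified = []        # state changes reported to clients (BGP)

    def _enqueue(self, prefix):
        # A node that is already queued is not queued a second time.
        if prefix in self.queued:
            return False
        self.queued.add(prefix)
        self.queue.append(prefix)
        return True

    def delete(self, prefix):
        self.removed.add(prefix)
        return self._enqueue(prefix)

    def add(self, prefix):
        self.pending_add.add(prefix)
        return self._enqueue(prefix)

    def add_recursive(self, prefix, nexthop):
        # Resolved immediately against the nexthop's current flags.
        usable = nexthop in self.active and nexthop not in self.removed
        if usable:
            self.active.add(prefix)
        return usable

    def process_queue(self):
        while self.queue:
            prefix = self.queue.popleft()
            self.queued.discard(prefix)
            before = prefix in self.active
            if prefix in self.pending_add:
                self.active.add(prefix)
            elif prefix in self.removed:
                self.active.discard(prefix)
            self.pending_add.discard(prefix)
            self.removed.discard(prefix)
            after = prefix in self.active
            # Assumption: clients are notified only when the state changed.
            if before != after:
                self.notified.append((prefix, after))


NHR = "169.254.10.3/32"
P1 = "192.168.11.96/32"

rib = ToyRib()
rib.active.add(NHR)                     # step 1: NHR present and valid
first = rib.delete(NHR)                 # step 2: queued, flagged REMOVED
second = rib.add(NHR)                   # step 3: node already queued
p1_active = rib.add_recursive(P1, NHR)  # step 4: NHR still flagged REMOVED
rib.process_queue()                     # NHR ends up active again

print(first, second, p1_active)         # True False False
print(NHR in rib.active, P1 in rib.active, rib.notified)  # True False []
```

In this model NHR looks the same before and after the queue is drained, so nothing is reported to the client and P1 is never re-evaluated, which matches the stuck state described above.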

To recover from this state, NHR needs to be withdrawn and re-advertised.

We are trying to fix this on our end. Please let me know if this is already known or fixed (I searched the existing issues but couldn't find an exact match).

Version

stable/9.0

How to reproduce

See the sequence in the description above.

Expected behavior

The route should be active in ZEBRA as well.

Actual behavior

The route is inactive in ZEBRA.

Additional context

No response

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@mksheera3 mksheera3 added the triage Needs further investigation label Jan 27, 2025
@donaldsharp
Member

No debugs, no examples of the problem state just description? It's clear that you've thought about the problem but this is really insufficient to do anything from our side.

@mksheera3
Author

mksheera3 commented Jan 29, 2025

Hi, thanks for your reply.
Sorry for not attaching any logs earlier; attaching them now.

NHR is 169.254.10.3/32
Prefix P1 is 192.168.11.96/32

Sequence: NHR is deleted and re-added, then P1 is added.

Result: P1 remains inactive in ZEBRA but valid/active in BGP.

cli-outputs.txt

bgp-debug-logs.txt
