significant error log spam if a daemon is started before mgmtd #17931

Open
eqvinox opened this issue Jan 27, 2025 · 4 comments

Comments

@eqvinox
Contributor

eqvinox commented Jan 27, 2025

Starting zebra before mgmtd is up and running results in a whole lot of the following:

2025/01/27 16:01:12 ZEBRA: [GK6BW-E28RD] NB_OP_CHANGE: oper_walk_done: ERROR: Error sending notification message for path: /frr-interface:lib/interface[name="dummy0"]/state
2025/01/27 16:01:12 ZEBRA: [X0ASY-1BHSQ] NB_OP_CHANGE: oper_walk_done: ERROR: Error notifying for datastore path: /frr-interface:lib/interface[name="dummy0"]/state: generic error
2025/01/27 16:01:12 ZEBRA: [YKHB9-ND03T] BE-client: msg_conn_send_msg: can't send message on closed connection
2025/01/27 16:01:12 ZEBRA: [GK6BW-E28RD] NB_OP_CHANGE: oper_walk_done: ERROR: Error sending notification message for path: /frr-interface:lib/interface[name="dummy0"]/state
2025/01/27 16:01:12 ZEBRA: [X0ASY-1BHSQ] NB_OP_CHANGE: oper_walk_done: ERROR: Error notifying for datastore path: /frr-interface:lib/interface[name="dummy0"]/state: generic error
2025/01/27 16:01:12 ZEBRA: [YKHB9-ND03T] BE-client: msg_conn_send_msg: can't send message on closed connection
2025/01/27 16:01:12 ZEBRA: [GK6BW-E28RD] NB_OP_CHANGE: oper_walk_done: ERROR: Error sending notification message for path: /frr-interface:lib/interface[name="dummy0"]/state
2025/01/27 16:01:12 ZEBRA: [X0ASY-1BHSQ] NB_OP_CHANGE: oper_walk_done: ERROR: Error notifying for datastore path: /frr-interface:lib/interface[name="dummy0"]/state: generic error
2025/01/27 16:01:12 ZEBRA: [YKHB9-ND03T] BE-client: msg_conn_send_msg: can't send message on closed connection
2025/01/27 16:01:12 ZEBRA: [GK6BW-E28RD] NB_OP_CHANGE: oper_walk_done: ERROR: Error sending notification message for path: /frr-interface:lib/interface[name="dummy0"]/state
2025/01/27 16:01:12 ZEBRA: [X0ASY-1BHSQ] NB_OP_CHANGE: oper_walk_done: ERROR: Error notifying for datastore path: /frr-interface:lib/interface[name="dummy0"]/state: generic error

Depending on the setup, the amount of log spam generated here can be significant.

⇒ It is unclear to me why this is an error to begin with. If it is intentional, it should be reported only once for each "down" transition of the notification connection (i.e. a sticky flag recording that the error was reported, cleared when the NB connection is successfully established). These error messages are also unhelpful in identifying the actual issue (mgmtd not fully running yet).
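
As an illustration of the sticky-flag idea, here is a minimal sketch in C. It is not FRR code: `struct notify_conn`, `notify_send`, `notify_conn_up` and `notify_conn_down` are hypothetical names invented for this example, and the message wording is only a suggestion. A failed send logs once per outage, counts what it drops, and the flag is cleared when the connection comes back.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical connection state; these are NOT FRR's actual structures. */
struct notify_conn {
	bool connected;        /* is the link to the management daemon up? */
	bool down_reported;    /* sticky: this outage has already been logged */
	unsigned long dropped; /* notifications dropped during this outage */
};

/* Try to send a notification. Returns true if sent, false if dropped.
 * Only the first drop of an outage produces a log line. */
static bool notify_send(struct notify_conn *conn, const char *path)
{
	if (conn->connected) {
		/* real code would hand the message to the transport here */
		return true;
	}

	conn->dropped++;
	if (!conn->down_reported) {
		conn->down_reported = true;
		fprintf(stderr,
			"management daemon not reachable, dropping notifications "
			"(first path: %s); will keep trying\n",
			path);
	}
	return false;
}

/* Called when the connection is (re-)established: summarize the outage
 * and clear the sticky flag so the next outage is reported again. */
static void notify_conn_up(struct notify_conn *conn)
{
	if (conn->down_reported)
		fprintf(stderr,
			"management daemon connected; %lu notifications were "
			"dropped while it was unreachable\n",
			conn->dropped);
	conn->connected = true;
	conn->down_reported = false;
	conn->dropped = 0;
}

/* Called when the connection is lost. */
static void notify_conn_down(struct notify_conn *conn)
{
	conn->connected = false;
}
```

A zero-initialized `struct notify_conn` starts out disconnected with the flag clear, which matches the case of a daemon started before the management daemon: the first failed `notify_send` logs one line and later ones stay silent until `notify_conn_up` runs.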

reproducer: just start zebra without mgmtd

@donaldsharp
Member

Given that none of our supported start up scripts start zebra before mgmtd, this feels more like a developer problem than an operator problem.

@eqvinox
Contributor Author

eqvinox commented Jan 28, 2025

Given that none of our supported start up scripts start zebra before mgmtd, this feels more like a developer problem than an operator problem.

Yes and no. For one, I'm not sure we don't have a race condition here (-d startup semantics, with holding the fork parent until we're done, should address this). But it also seems like this will happen if mgmtd gets restarted for some reason.

(mgmtd getting restarted is supported AFAIK? Not sure what the interdependency status here is; zebra really can't be restarted without restarting everything else…)

@eqvinox
Contributor Author

eqvinox commented Jan 28, 2025

our supported start up scripts

It is also a bit foolish to believe people won't muck with this. Things like OpenWRT and people using other service managers (runit, s6, dinit, etc.) exist, and some of them have integration requirements that flat out don't work with our init scripts (e.g. directly managing non-daemonized child processes).

@mjstapp
Contributor

mjstapp commented Jan 28, 2025

I'm with David here - it'd be better to have a clearer report, like "I can't find mgmtd, I'll keep trying."
