Skip to content
This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

Sync with scx repo #213

Merged
merged 1 commit into from
May 26, 2024
Merged

Sync with scx repo #213

merged 1 commit into from
May 26, 2024

Conversation

vax-r
Copy link
Contributor

@vax-r vax-r commented May 26, 2024

Description

Sync with SCX repository.

Signed-off-by: I Hsin Cheng <[email protected]>
@htejun htejun merged commit 7648df7 into sched-ext:sched_ext May 26, 2024
1 check passed
arighi pushed a commit to arighi/sched_ext that referenced this pull request Jun 7, 2024
BugLink: https://bugs.launchpad.net/bugs/2068087

[ Upstream commit 0f022d3 ]

When the mirred action is used on a classful egress qdisc and a packet is
mirrored or redirected to self we hit a qdisc lock deadlock.
See trace below.

[..... other info removed for brevity....]
[   82.890906]
[   82.890906] ============================================
[   82.890906] WARNING: possible recursive locking detected
[   82.890906] 6.8.0-05205-g77fadd89fe2d-dirty sched-ext#213 Tainted: G        W
[   82.890906] --------------------------------------------
[   82.890906] ping/418 is trying to acquire lock:
[   82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[   82.890906]
[   82.890906] but task is already holding lock:
[   82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[   82.890906]
[   82.890906] other info that might help us debug this:
[   82.890906]  Possible unsafe locking scenario:
[   82.890906]
[   82.890906]        CPU0
[   82.890906]        ----
[   82.890906]   lock(&sch->q.lock);
[   82.890906]   lock(&sch->q.lock);
[   82.890906]
[   82.890906]  *** DEADLOCK ***
[   82.890906]
[..... other info removed for brevity....]

Example setup (eth0->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth0

Another example(eth0->eth1->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth1

tc qdisc add dev eth1 root handle 1: htb default 30
tc filter add dev eth1 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth0

We fix this by adding an owner field (CPU id) to struct Qdisc set after
root qdisc is entered. When the softirq enters it a second time, if the
qdisc owner is the same CPU, the packet is dropped to break the loop.

Reported-by: Mingshuai Ren <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Fixes: 3bcb846 ("net: get rid of spin_trylock() in net_tx_action()")
Fixes: e578d9c ("net: sched: use counter to break reclassify loops")
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Victor Nogueira <[email protected]>
Reviewed-by: Pedro Tammela <[email protected]>
Tested-by: Jamal Hadi Salim <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Portia Stephens <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants