This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

scx: Close a small race window in the enable path which can lead to use-after-free #205

Merged
merged 1 commit into sched_ext on May 18, 2024

Conversation

htejun
Collaborator

@htejun htejun commented May 17, 2024

scx_ops_enable() was iterating all tasks without locking each task's rq and then calling get_task_struct() on them. However, this can race against the task free path: the task struct itself is still accessible, but its refcnt may already have reached zero by the time get_task_struct() is called. This triggers the following refcnt warning.

WARNING: CPU: 11 PID: 2834521 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0xe0
...
RIP: 0010:refcount_warn_saturate+0x7d/0xe0
...
Call Trace:
bpf_scx_reg+0x851/0x890
bpf_struct_ops_link_create+0xbd/0xe0
__sys_bpf+0x2862/0x2d20
__x64_sys_bpf+0x18/0x20
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0

This can also affect other loops, as they can call sched functions on tasks which are already destroyed. Let's fix it by making scx_task_iter_filtered_locked() also filter out TASK_DEAD tasks, and by switching the scx_task_iter_filtered() usage in scx_ops_enable() to scx_task_iter_filtered_locked(). The iterator's next function then always tests TASK_DEAD while holding the task's rq lock, and the rq lock guarantees that the task won't be freed, closing the race window.

@htejun htejun requested a review from Byte-Lab May 17, 2024 23:44
Collaborator

@Byte-Lab Byte-Lab left a comment


Nice debugging

@Byte-Lab Byte-Lab merged commit fb1406f into sched_ext May 18, 2024
1 check passed
@Byte-Lab Byte-Lab deleted the htejun/fix-task-iter-race branch May 18, 2024 01:40