Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock while remounting nested bound mountpoint #40

Open
arko-pl opened this issue Jun 17, 2024 · 13 comments
Open

Deadlock while remounting nested bound mountpoint #40

arko-pl opened this issue Jun 17, 2024 · 13 comments

Comments

@arko-pl
Copy link

arko-pl commented Jun 17, 2024

Hello,

I noticed a deadlock while trying to remount nested and bound mountpoint. It is reproducible with different underlying filesystems.
Steps to reproduce (assuming /union is a simple aufs r/w mount point):

mkdir /union/mnt/new
mount -o bind /union /union/mnt/new
mount -o remount /union/mnt/new
after last command console is locked.

At the time of writing last kernel that works is 5.4. Issue is observed also on 5.15 LTS kernel and 6.1 LTS kernel.

There is dump of tasks from the 6.1.92 kernel:

<4>[  335.001920][T16359] ------------[ cut here ]------------
<4>[  335.002811][T16359] DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) && !rwsem_test_oflags(sem, RWSEM_NONSPINNABLE)): count = 0x0, magic = 0xffff88810fde8050, owner = 0x1, curr 0xffff88800dba8000, list empty
<4>[  335.005549][T16359] WARNING: CPU: 3 PID: 16359 at kernel/locking/rwsem.c:1371 up_write+0x124/0x190
<4>[  335.007259][T16359] Modules linked in: sch_htb iscsi_scst(O) scst_vdisk(O) scst(O) dlm iptable_filter bpfilter target_core_iblock target_core_pscsi iscsi_target_mod target_core_mod bonding ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse e1000 button nls_cp437 nls_iso8859_1 virtio_scsi virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev sg virtio virtio_ring pata_acpi ata_generic vfat fat aufs scsi_transport_fc
<4>[  335.013543][T16359] CPU: 3 PID: 16359 Comm: mount Kdump: loaded Tainted: G           O       6.1.92 #16 87b4303e5ae2d4a7f6c994aa88151842d80c2918
<4>[  335.015889][T16359] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
<4>[  335.017275][T16359] RIP: 0010:up_write+0x124/0x190
<4>[  335.018216][T16359] Code: f8 38 99 82 c6 05 18 38 0c 02 01 48 39 c2 48 c7 c2 89 38 99 82 48 c7 c0 29 12 9a 82 48 0f 45 c2 48 8b 55 00 50 e8 8c 0b f9 ff <0f> 0b 59 eb 8f 48 8b 55 58 48 8d 45 58 4c 8b 45 08 48 c7 c6 8e 38
<4>[  335.021080][T16359] RSP: 0018:ffff888011857d80 EFLAGS: 00010282
<4>[  335.021968][T16359] RAX: 0000000000000000 RBX: ffff88810c958750 RCX: ffff8881549ac6c8
<4>[  335.023136][T16359] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff8881549ac6c0
<4>[  335.024292][T16359] RBP: ffff88810fde8050 R08: ffffffff82fe9868 R09: 0000000000027ffb
<4>[  335.025450][T16359] R10: 0000000000000293 R11: ffffffff82f29880 R12: ffff88810cf64800
<4>[  335.026654][T16359] R13: ffff88800dba8000 R14: 000000000000000b R15: ffff88810cf6b000
<4>[  335.028031][T16359] FS:  00007f2529cc7840(0000) GS:ffff888154980000(0000) knlGS:0000000000000000
<4>[  335.029338][T16359] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  335.030503][T16359] CR2: 00007fffe6aa9f80 CR3: 000000000eec0000 CR4: 00000000000006e0
<4>[  335.031703][T16359] Call Trace:
<4>[  335.032379][T16359]  <TASK>
<4>[  335.033195][T16359]  ? __warn+0x79/0xc0
<4>[  335.033902][T16359]  ? up_write+0x124/0x190
<4>[  335.034654][T16359]  ? report_bug+0xf9/0x130
<4>[  335.035439][T16359]  ? prb_read_valid+0x17/0x20
<4>[  335.036243][T16359]  ? handle_bug+0x42/0x70
<4>[  335.037085][T16359]  ? exc_invalid_op+0x14/0x70
<4>[  335.038096][T16359]  ? asm_exc_invalid_op+0x16/0x20
<4>[  335.038976][T16359]  ? up_write+0x124/0x190
<4>[  335.039735][T16359]  di_write_unlock+0x2b/0x40 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<4>[  335.041177][T16359]  au_remount_refresh+0xe4/0x500 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<4>[  335.042754][T16359]  au_fsctx_reconfigure+0x11e/0x130 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<4>[  335.044362][T16359]  reconfigure_super+0xcb/0x280
<4>[  335.045193][T16359]  path_mount+0x986/0xb70
<4>[  335.045936][T16359]  __x64_sys_mount+0x107/0x140
<4>[  335.046805][T16359]  do_syscall_64+0x3b/0x90
<4>[  335.047821][T16359]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<4>[  335.048966][T16359] RIP: 0033:0x7f2529390d8a
<4>[  335.049722][T16359] Code: 48 8b 0d 01 c1 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c0 2b 00 f7 d8 64 89 01 48
<4>[  335.052880][T16359] RSP: 002b:00007ffdca417ac8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
<4>[  335.054215][T16359] RAX: ffffffffffffffda RBX: 00000000017b9060 RCX: 00007f2529390d8a
<4>[  335.055709][T16359] RDX: 00000000017b92d0 RSI: 00000000017c6030 RDI: 00000000017c6010
<4>[  335.057014][T16359] RBP: 0000000000000000 R08: 00000000017c1a80 R09: 00007f25292f399a
<4>[  335.058483][T16359] R10: 0000000000200030 R11: 0000000000000202 R12: 00000000017c6010
<4>[  335.059807][T16359] R13: 00000000017b92d0 R14: 0000000000000000 R15: 0000000000000003
<4>[  335.061067][T16359]  </TASK>
<4>[  335.061616][T16359] irq event stamp: 0
<4>[  335.062445][T16359] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
<4>[  335.063590][T16359] hardirqs last disabled at (0): [<ffffffff8108a7cb>] copy_process+0x90b/0x1eb0
<4>[  335.065005][T16359] softirqs last  enabled at (0): [<ffffffff8108a7cb>] copy_process+0x90b/0x1eb0
<4>[  335.066414][T16359] softirqs last disabled at (0): [<0000000000000000>] 0x0
<4>[  335.067811][T16359] ---[ end trace 0000000000000000 ]---
<3>[  365.386687][   T36] INFO: task snmpd:21963 blocked for more than 30 seconds.
<3>[  365.391150][   T36]       Tainted: G        W  O       6.1.92 #16
<3>[  365.394968][   T36] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[  365.398658][   T36] task:snmpd           state:D stack:0     pid:21963 ppid:1      flags:0x00000000
<6>[  365.401369][   T36] Call Trace:
<6>[  365.401936][   T36]  <TASK>
<6>[  365.402466][   T36]  __schedule+0x2d7/0xad0
<6>[  365.403209][   T36]  schedule+0x57/0xa0
<6>[  365.403852][   T36]  schedule_preempt_disabled+0x11/0x20
<6>[  365.404700][   T36]  rwsem_down_read_slowpath+0x28d/0x580
<6>[  365.405555][   T36]  down_read+0x63/0x140
<6>[  365.406231][   T36]  si_read_lock+0x40/0x100 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.407586][   T36]  ? lock_acquire+0x138/0x2b0
<6>[  365.408623][   T36]  ? lock_acquire+0x138/0x2b0
<6>[  365.409407][   T36]  aufs_permission+0x44/0x2f0 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.410791][   T36]  ? lock_release+0x1ad/0x2d0
<6>[  365.411521][   T36]  inode_permission.part.0+0x113/0x1b0
<6>[  365.412370][   T36]  link_path_walk.part.0+0x2a9/0x3d0
<6>[  365.413209][   T36]  ? path_init+0x3b9/0x4b0
<6>[  365.413906][   T36]  path_openat+0xb2/0xaa0
<6>[  365.414601][   T36]  ? _raw_spin_unlock+0x1f/0x30
<6>[  365.415351][   T36]  do_filp_open+0xa9/0x150
<6>[  365.416042][   T36]  ? _raw_spin_unlock+0x1f/0x30
<6>[  365.416819][   T36]  ? alloc_fd+0x12d/0x200
<6>[  365.417800][   T36]  do_sys_openat2+0x9b/0x160
<6>[  365.418522][   T36]  __x64_sys_open+0x55/0xa0
<6>[  365.419243][   T36]  do_syscall_64+0x3b/0x90
<6>[  365.419923][   T36]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6>[  365.420805][   T36] RIP: 0033:0x7f1aa3fb3960
<6>[  365.421484][   T36] RSP: 002b:00007ffef1f9bac8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
<6>[  365.422730][   T36] RAX: ffffffffffffffda RBX: 0000000001f91060 RCX: 00007f1aa3fb3960
<6>[  365.423890][   T36] RDX: 00000000000001b6 RSI: 0000000000000000 RDI: 00007f1aa4968a52
<6>[  365.425057][   T36] RBP: 00007ffef1f9bb50 R08: 0000000000000008 R09: 0000000000000001
<6>[  365.426225][   T36] R10: 00007f1aa4970c93 R11: 0000000000000246 R12: 0000000000000008
<6>[  365.427446][   T36] R13: 0000000000000001 R14: 00007ffef1f9c140 R15: 00007f1aa4f64b00
<6>[  365.428837][   T36]  </TASK>
<3>[  365.429398][   T36] INFO: task kbtest:9102 blocked for more than 30 seconds.
<3>[  365.430472][   T36]       Tainted: G        W  O       6.1.92 #16
<3>[  365.431759][   T36] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[  365.433029][   T36] task:kbtest          state:D stack:0     pid:9102  ppid:9083   flags:0x00000000
<6>[  365.434443][   T36] Call Trace:
<6>[  365.434975][   T36]  <TASK>
<6>[  365.435459][   T36]  __schedule+0x2d7/0xad0
<6>[  365.436130][   T36]  schedule+0x57/0xa0
<6>[  365.436818][   T36]  schedule_preempt_disabled+0x11/0x20
<6>[  365.437840][   T36]  rwsem_down_read_slowpath+0x28d/0x580
<6>[  365.438717][   T36]  down_read+0x63/0x140
<6>[  365.439362][   T36]  si_read_lock+0x40/0x100 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.440613][   T36]  ? lock_acquire+0x138/0x2b0
<6>[  365.441332][   T36]  ? lock_acquire+0x138/0x2b0
<6>[  365.442046][   T36]  aufs_permission+0x44/0x2f0 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.443367][   T36]  ? lock_release+0x1ad/0x2d0
<6>[  365.444083][   T36]  inode_permission.part.0+0x113/0x1b0
<6>[  365.444906][   T36]  link_path_walk.part.0+0x2a9/0x3d0
<6>[  365.445703][   T36]  ? path_init+0x3b9/0x4b0
<6>[  365.446397][   T36]  path_lookupat+0x43/0x1d0
<6>[  365.447310][   T36]  filename_lookup+0xcb/0x1d0
<6>[  365.448057][   T36]  ? lock_acquire+0x138/0x2b0
<6>[  365.448775][   T36]  ? kmem_cache_alloc+0x43a/0x530
<6>[  365.449548][   T36]  ? getname_flags.part.0+0x29/0x1b0
<6>[  365.453661][   T36]  ? strncpy_from_user+0x1e/0x140
<6>[  365.454478][   T36]  ? getname_flags.part.0+0x45/0x1b0
<6>[  365.455278][   T36]  user_path_at_empty+0x3a/0x60
<6>[  365.456020][   T36]  do_faccessat+0x7c/0x270
<6>[  365.456761][   T36]  do_syscall_64+0x3b/0x90
<6>[  365.457687][   T36]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6>[  365.458607][   T36] RIP: 0033:0x7f2b83cdfc57
<6>[  365.459376][   T36] RSP: 002b:00007ffda0325398 EFLAGS: 00000246 ORIG_RAX: 0000000000000015
<6>[  365.460642][   T36] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2b83cdfc57
<6>[  365.461831][   T36] RDX: 00000000006eb980 RSI: 0000000000000000 RDI: 00000000004bfb00
<6>[  365.463100][   T36] RBP: 00007ffda03253a0 R08: 0000000000000000 R09: 0000000000000000
<6>[  365.464288][   T36] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000407c5c
<6>[  365.465475][   T36] R13: 00007ffda03254b0 R14: 0000000000000000 R15: 0000000000000000
<6>[  365.466739][   T36]  </TASK>
<3>[  365.467482][   T36] INFO: task mount:16359 blocked for more than 30 seconds.
<3>[  365.468564][   T36]       Tainted: G        W  O       6.1.92 #16
<3>[  365.469877][   T36] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[  365.471146][   T36] task:mount           state:D stack:0     pid:16359 ppid:13291  flags:0x00004000
<6>[  365.472560][   T36] Call Trace:
<6>[  365.473106][   T36]  <TASK>
<6>[  365.473590][   T36]  __schedule+0x2d7/0xad0
<6>[  365.474274][   T36]  schedule+0x57/0xa0
<6>[  365.474894][   T36]  schedule_preempt_disabled+0x11/0x20
<6>[  365.475714][   T36]  rwsem_down_read_slowpath+0x28d/0x580
<6>[  365.476588][   T36]  down_read_nested+0x61/0x140
<6>[  365.477607][   T36]  di_read_lock+0x25/0x130 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.478878][   T36]  au_do_refresh+0x2e/0x80 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.480157][   T36]  au_remount_refresh+0x205/0x500 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.481566][   T36]  au_fsctx_reconfigure+0x11e/0x130 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.483030][   T36]  reconfigure_super+0xcb/0x280
<6>[  365.483774][   T36]  path_mount+0x986/0xb70
<6>[  365.484448][   T36]  __x64_sys_mount+0x107/0x140
<6>[  365.485214][   T36]  do_syscall_64+0x3b/0x90
<6>[  365.485903][   T36]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6>[  365.486863][   T36] RIP: 0033:0x7f2529390d8a
<6>[  365.487801][   T36] RSP: 002b:00007ffdca417ac8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
<6>[  365.489044][   T36] RAX: ffffffffffffffda RBX: 00000000017b9060 RCX: 00007f2529390d8a
<6>[  365.490242][   T36] RDX: 00000000017b92d0 RSI: 00000000017c6030 RDI: 00000000017c6010
<6>[  365.491414][   T36] RBP: 0000000000000000 R08: 00000000017c1a80 R09: 00007f25292f399a
<6>[  365.492607][   T36] R10: 0000000000200030 R11: 0000000000000202 R12: 00000000017c6010
<6>[  365.493778][   T36] R13: 00000000017b92d0 R14: 0000000000000000 R15: 0000000000000003
<6>[  365.494966][   T36]  </TASK>
<3>[  365.495472][   T36] INFO: task modprobe:16385 blocked for more than 30 seconds.
<3>[  365.496624][   T36]       Tainted: G        W  O       6.1.92 #16
<3>[  365.497976][   T36] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[  365.499280][   T36] task:modprobe        state:D stack:0     pid:16385 ppid:16123  flags:0x00000000
<6>[  365.500700][   T36] Call Trace:
<6>[  365.501246][   T36]  <TASK>
<6>[  365.501738][   T36]  __schedule+0x2d7/0xad0
<6>[  365.502448][   T36]  schedule+0x57/0xa0
<6>[  365.503075][   T36]  schedule_preempt_disabled+0x11/0x20
<6>[  365.503900][   T36]  rwsem_down_read_slowpath+0x28d/0x580
<6>[  365.504743][   T36]  down_read+0x63/0x140
<6>[  365.505393][   T36]  si_read_lock+0x40/0x100 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.506736][   T36]  aufs_lookup.part.0.isra.0+0x1b/0x230 [aufs 02e4ef5b7c7da454108981099dd5cc5175c944e7]
<6>[  365.508248][   T36]  __lookup_slow+0xff/0x1b0
<6>[  365.508962][   T36]  walk_component+0xe5/0x160
<6>[  365.509691][   T36]  path_lookupat+0x73/0x1d0
<6>[  365.510430][   T36]  filename_lookup+0xcb/0x1d0
<6>[  365.511154][   T36]  ? lock_acquire+0x138/0x2b0
<6>[  365.511874][   T36]  ? lock_release+0x1ad/0x2d0
<6>[  365.512593][   T36]  ? lock_release+0x1ad/0x2d0
<6>[  365.513310][   T36]  ? fs_reclaim_acquire+0x51/0xe0
<6>[  365.514081][   T36]  vfs_statx+0x8d/0x170
<6>[  365.514745][   T36]  vfs_fstatat+0x54/0x70
<6>[  365.515403][   T36]  __do_sys_newfstatat+0x26/0x60
<6>[  365.516152][   T36]  ? trace_hardirqs_on+0x2c/0xd0
<6>[  365.516925][   T36]  ? syscall_enter_from_user_mode+0x1d/0x50
<6>[  365.517864][   T36]  do_syscall_64+0x3b/0x90
<6>[  365.518556][   T36]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6>[  365.519467][   T36] RIP: 0033:0x7ff5b5ae167a
<6>[  365.520144][   T36] RSP: 002b:00007ffd21dde028 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
<6>[  365.521362][   T36] RAX: ffffffffffffffda RBX: 000055696e6a921b RCX: 00007ff5b5ae167a
<6>[  365.522528][   T36] RDX: 00007ffd21dde180 RSI: 000055696e6a921b RDI: 0000000000000003
<6>[  365.523684][   T36] RBP: 000055696d21e054 R08: 0000000000000000 R09: 000000000000000e
<6>[  365.524843][   T36] R10: 0000000000000000 R11: 0000000000000246 R12: 000055696d426368
<6>[  365.526001][   T36] R13: 000055696e6a9120 R14: 000055696e6a9010 R15: 000055696e6a9208
<6>[  365.527230][   T36]  </TASK>
@sfjro
Copy link
Owner

sfjro commented Jun 17, 2024 via email

@arko-pl
Copy link
Author

arko-pl commented Jun 18, 2024

And did you install aufs-util.git on your system?

I got outdated version, but after update I'm also reproducing this issue.

@sfjro
Copy link
Owner

sfjro commented Jun 18, 2024 via email

@arko-pl
Copy link
Author

arko-pl commented Jun 19, 2024

Did you apply some other patches to your kernel?

No, only the standard set of patches was applied.
Here is how I'm reproducing this from scratch:

mkdir /union
mkdir /mnt/image
dd if=/dev/zero of=/tmp/image bs=1M count=8
mkfs.ext4 /tmp/image
mount -o loop /tmp/image /mnt/image
mount -t aufs -o br:/mnt/image=rw none /union
mkidr /union/usr
cp /bin/bash /union/usr/
mkdir -p /union/mnt/test
mount -o remount /union/mnt/test

@arko-pl arko-pl closed this as completed Jun 19, 2024
@arko-pl arko-pl reopened this Jun 19, 2024
@sfjro
Copy link
Owner

sfjro commented Jun 19, 2024 via email

@arko-pl
Copy link
Author

arko-pl commented Jun 19, 2024

Did you forget
mount -o bind /union /union/mnt/test
just before the last "remount"?

Sorry, my "enter" key got stuck, sent the message and closed the ticket... This is correct set of commands:

mkdir -p /mnt/image
mkdir /union
dd if=/dev/zero of=/tmp/image bs=1M count=8
mkfs.ext4 /tmp/image
mount -o loop /tmp/image /mnt/image
mount -t aufs -o br:/mnt/image=rw none /union
mkidr /union/usr
cp /bin/bash /union/usr/
mkdir -p /union/mnt/test
mount -o bind /union/usr /union/mnt/test
mount -o remount /union/mnt/test

mount -o move doesn't work it ends with "invalid argument".

There is one more thing to add. The aufs is mounted at the initrd stage where busybox is used. But I'm also reproducing on running system without busybox.

@sfjro
Copy link
Owner

sfjro commented Jun 20, 2024 via email

@arko-pl
Copy link
Author

arko-pl commented Jun 20, 2024

Great, thank you.

@sfjro
Copy link
Owner

sfjro commented Jun 20, 2024 via email

@arko-pl
Copy link
Author

arko-pl commented Jun 20, 2024

Would you test this patch?
I tested it and could not reproduce issue anymore. Thank you 👍

@sfjro
Copy link
Owner

sfjro commented Jun 22, 2024 via email

@arko-pl
Copy link
Author

arko-pl commented Jun 25, 2024

Great news. Thank you.

sfjro added a commit to sfjro/aufs-linux that referenced this issue Jun 30, 2024
If the subdir in aufs is "bind" mounted and remount is issued,
fs_context gives us the bound non-root dentry as if it were root.
But aufs expects fc->root being always root.

See-also: sfjro/aufs-standalone#40
Signed-off-by: J. R. Okajima <[email protected]>
@sfjro
Copy link
Owner

sfjro commented Jun 30, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants