canonical-ubuntu-qa team mailing list archive
Message #05327
[Bug 2078300] Re: ubuntu_zfs_stress triggers kernel BUG at mm/usercopy.c:99 on J-generic/lowlatency-64k
This appears to be specific to the 64k kernels on Jammy. The non-64k
Jammy 5.15 variant kernels and the F-hwe-5.15 / F-lowlatency-hwe-5.15 64k
kernels are not affected.
--
You received this bug notification because you are a member of Canonical
Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/2078300
Title:
ubuntu_zfs_stress triggers kernel BUG at mm/usercopy.c:99 on
J-generic/lowlatency-64k
Status in ubuntu-kernel-tests:
New
Bug description:
The issue was found with the Jammy 5.15.0-121.131 generic-64k and lowlatency-64k
kernels on an OpenStack ARM64 instance. (Both were good with 5.15.0-118.128
in the last cycle.)
The test failed because some stress-ng processes could not be terminated,
so the test never finished properly and the sut-test failure was raised:
stress-ng: info: [571640] aio stressor will be skipped, it is not implemented on this system: aarch64 Linux 5.15.0-121-lowlatency-64k gcc 11.4.0 (built without aio.h)
stress-ng: info: [571640] setting to a 5 secs run per stressor
stress-ng: info: [571640] dispatching hogs: 4 hdd, 4 link, 4 symlink, 4 lockf, 4 seek, 4 dentry, 4 dir, 4 fallocate, 4 fstat, 1 io, 4 lease, 2 mmap, 4 open, 4 rename, 4 chdir, 4 chmod, 4 filename, 4 rename
stress-ng: info: [571640] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: info: [571681] io: this is a legacy I/O sync stressor, consider using iomix instead
stress-ng: info: [571688] open: using a maximum of 1048576 file descriptors
stress-ng: info: [571698] chdir: removing 8192 directories
stress-ng: warn: [571687] cannot terminate process 571939, gave up after 120 seconds
stress-ng: warn: [571686] cannot terminate process 571938, gave up after 120 seconds
sut-test TEST SYSTEM FAILURE DETECTED Test results file '/home/openstack/workspace/jammy-linux-lowlatency-lowlatency-64k-arm64-5.15.0-cpu2-ram4-disk20-ubuntu_zfs_stress/kernel-results.xml' not found.
However, the console output shows the underlying problem. Below is the output from 5.15.0-121-lowlatency-64k:
[ 840.682657] usercopy: Kernel memory exposure attempt detected from SLUB object 'zio_buf_comb_4096' (offset 1, size 8191)!
[ 840.685084] usercopy: Kernel memory exposure attempt detected from SLUB object 'zio_buf_comb_4096' (offset 1, size 8191)!
[ 840.687391] kernel BUG at mm/usercopy.c:99!
[ 840.688377] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
[ 840.689705] Modules linked in: zfs(PO) zunicode(PO) zzstd(O) zlua(O) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) binfmt_misc nls_iso8859_1 qemu_fw_cfg dm_multipath sch_fq_codel scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_net net_failover virtio_scsi failover aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[ 840.699464] CPU: 0 PID: 628890 Comm: stress-ng Tainted: P O 5.15.0-121-lowlatency-64k #131-Ubuntu
[ 840.701319] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 840.702626] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 840.704008] pc : usercopy_abort+0x98/0x9c
[ 840.704843] lr : usercopy_abort+0x98/0x9c
[ 840.705597] sp : ffff8000253cf910
[ 840.706216] x29: ffff8000253cf920 x28: 0000000000000000 x27: 0000000000010000
[ 840.707562] x26: ffff00002c282400 x25: ffff8000253cfbe0 x24: 0000000000000001
[ 840.708804] x23: 0000000000000000 x22: 0000000000001000 x21: 0000000000001fff
[ 840.710167] x20: ffff0000c001af00 x19: 0000000000000001 x18: 0000000000000000
[ 840.711505] x17: 656a626f2042554c x16: 53206d6f72662064 x15: 6574636574656420
[ 840.712845] x14: 74706d6574746120 x13: 2129313931382065 x12: 7a6973202c312074
[ 840.714286] x11: 657366666f282027 x10: 363930345f626d6f x9 : ffff8000082b5c48
[ 840.715615] x8 : 2079726f6d656d20 x7 : 0000000000000001 x6 : 0000000000000001
[ 840.716849] x5 : 0000000000000000 x4 : ffff0000ff9c2ac8 x3 : 0000000000000000
[ 840.718224] x2 : 0000000000000000 x1 : ffff0000db54fc00 x0 : 000000000000006d
[ 840.719546] Call trace:
[ 840.719976] usercopy_abort+0x98/0x9c
[ 840.720656] __check_heap_object+0x194/0x1d0
[ 840.721476] __check_object_size.part.0+0x160/0x1e0
[ 840.722414] __check_object_size+0x28/0x40
[ 840.723197] zfs_uiomove_iter+0x68/0x110 [zfs]
[ 840.724147] zfs_uiomove+0x40/0x60 [zfs]
[ 840.725136] dmu_read_uio_dnode+0xc8/0x120 [zfs]
[ 840.726141] dmu_read_uio_dbuf+0x58/0x80 [zfs]
[ 840.727096] mappedread+0xe8/0x150 [zfs]
[ 840.727955] zfs_read+0x164/0x350 [zfs]
[ 840.728783] zpl_iter_read+0xa4/0x12c [zfs]
[ 840.729640] new_sync_read+0xf0/0x184
[ 840.730320] vfs_read+0x15c/0x1f4
[ 840.730938] ksys_read+0x70/0x100
[ 840.731561] __arm64_sys_read+0x24/0x30
[ 840.732295] invoke_syscall+0x78/0x100
[ 840.732994] el0_svc_common.constprop.0+0x54/0x184
[ 840.733906] do_el0_svc+0x30/0xac
[ 840.734514] el0_svc+0x48/0x160
[ 840.735113] el0t_64_sync_handler+0xa4/0x12c
[ 840.735910] el0t_64_sync+0x1a4/0x1a8
[ 840.736617] Code: aa0003e3 90003020 910ce000 97fff353 (d4210000)
[ 840.737783] ---[ end trace d4861bf0f486b2ad ]---
[ 840.876737] ------------[ cut here ]------------
[ 840.880650] kernel BUG at mm/usercopy.c:99!
[ 840.982155] note: stress-ng[628890] exited with preempt_count 1
[ 840.982155] Internal error: Oops - BUG: 00000000f2000800 [#2] PREEMPT SMP
[ 840.982160] Modules linked in: zfs(PO) zunicode(PO) zzstd(O) zlua(O) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) binfmt_misc nls_iso8859_1 qemu_fw_cfg dm_multipath sch_fq_codel scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_net net_failover virtio_scsi failover aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[ 840.995156] CPU: 1 PID: 628887 Comm: stress-ng Tainted: P D O 5.15.0-121-lowlatency-64k #131-Ubuntu
[ 840.997028] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 840.998545] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 840.999821] pc : usercopy_abort+0x98/0x9c
[ 841.000637] lr : usercopy_abort+0x98/0x9c
[ 841.001401] sp : ffff80002536f880
[ 841.002030] x29: ffff80002536f890 x28: 0000000000000000 x27: 0000000000010000
[ 841.003403] x26: ffff00002c280480 x25: ffff80002536fb50 x24: 0000000000000001
[ 841.004743] x23: 0000000000000000 x22: 0000000000001000 x21: 0000000000001fff
[ 841.006269] x20: ffff0000c001af00 x19: 0000000000000001 x18: 00000000a8cb9176
[ 841.007610] x17: ffff8000f5990000 x16: ffff800008020000 x15: ffff0000551aaf40
[ 841.008958] x14: ffff80000a968040 x13: ffff80000a967b28 x12: 0000000000000001
[ 841.010218] x11: 0000000000000004 x10: 0000000000001b30 x9 : ffff8000082b5c48
[ 841.011556] x8 : ffff0000d01fb510 x7 : ffff0000d3c78200 x6 : ffff0000d1d88ac8
[ 841.012906] x5 : 0000000000000000 x4 : ffff0000ffa62ac8 x3 : 0000000000000000
[ 841.014285] x2 : 0000000000000000 x1 : ffff0000d01f9980 x0 : 000000000000006d
[ 841.015609] Call trace:
[ 841.016085] usercopy_abort+0x98/0x9c
[ 841.016787] __check_heap_object+0x194/0x1d0
[ 841.017603] __check_object_size.part.0+0x160/0x1e0
[ 841.018535] __check_object_size+0x28/0x40
[ 841.019321] zfs_uiomove_iter+0x68/0x110 [zfs]
[ 841.020297] zfs_uiomove+0x40/0x60 [zfs]
[ 841.021196] dmu_read_uio_dnode+0xc8/0x120 [zfs]
[ 841.022182] dmu_read_uio_dbuf+0x58/0x80 [zfs]
[ 841.023054] mappedread+0xe8/0x150 [zfs]
[ 841.023893] zfs_read+0x164/0x350 [zfs]
[ 841.024747] zpl_iter_read+0xa4/0x12c [zfs]
[ 841.025637] new_sync_read+0xf0/0x184
[ 841.026329] vfs_read+0x15c/0x1f4
[ 841.026950] ksys_read+0x70/0x100
[ 841.027572] __arm64_sys_read+0x24/0x30
[ 841.028301] invoke_syscall+0x78/0x100
[ 841.029010] el0_svc_common.constprop.0+0x54/0x184
[ 841.029930] do_el0_svc+0x30/0xac
[ 841.030556] el0_svc+0x48/0x160
[ 841.031256] el0t_64_sync_handler+0xa4/0x12c
[ 841.032079] el0t_64_sync+0x1a4/0x1a8
[ 841.032833] Code: aa0003e3 90003020 910ce000 97fff353 (d4210000)
[ 841.033995] ---[ end trace d4861bf0f486b2ae ]---
[ 841.035385] note: stress-ng[628887] exited with preempt_count 1
[ 841.040774] ------------[ cut here ]------------
[ 841.041656] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:614 rcu_eqs_enter.constprop.0+0xa4/0xb0
[ 841.043373] Modules linked in: zfs(PO) zunicode(PO) zzstd(O) zlua(O) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) binfmt_misc nls_iso8859_1 qemu_fw_cfg dm_multipath sch_fq_codel scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_net net_failover virtio_scsi failover aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[ 841.053419] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P D O 5.15.0-121-lowlatency-64k #131-Ubuntu
[ 841.055253] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 841.056585] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 841.057965] pc : rcu_eqs_enter.constprop.0+0xa4/0xb0
[ 841.058899] lr : rcu_idle_enter+0x18/0x24
[ 841.059659] sp : ffff80000a90fd30
[ 841.060308] x29: ffff80000a90fd30 x28: 000000012f320018 x27: 000000013b530aa0
[ 841.061618] x26: 000000013b530a20 x25: 000000013b530960 x24: 000000013f998528
[ 841.062927] x23: 0000000000060000 x22: ffff80000a933900 x21: ffff80000a933900
[ 841.064316] x20: 0000000000000000 x19: ffff0000ffa4d700 x18: 00000000fd3f2d21
[ 841.065793] x17: ffffffffffffffff x16: ffffffffffffffff x15: 7b1f040030771f04
[ 841.067147] x14: ffff80000a968040 x13: ffff80000a967b28 x12: 0000000000000000
[ 841.068507] x11: 000000000000000c x10: 0000000000001b30 x9 : ffff8000090b2740
[ 841.069872] x8 : ffff80000a935490 x7 : 0000000000014000 x6 : ffff0000ff9c3628
[ 841.071205] x5 : 00000000410fd0c0 x4 : ffff8000f58f0000 x3 : 0000000000000000
[ 841.072532] x2 : 0000000000000000 x1 : 4000000000000002 x0 : 4000000000000000
[ 841.073946] Call trace:
[ 841.074415] rcu_eqs_enter.constprop.0+0xa4/0xb0
[ 841.075292] rcu_idle_enter+0x18/0x24
[ 841.075991] default_idle_call+0x40/0x1ac
[ 841.076729] cpuidle_idle_call+0x174/0x200
[ 841.077478] do_idle+0xac/0x100
[ 841.078072] cpu_startup_entry+0x30/0x6c
[ 841.078822] rest_init+0x104/0x130
[ 841.079468] arch_call_rest_init+0x18/0x24
[ 841.080256] start_kernel+0x4b4/0x4ec
[ 841.080957] __primary_switched+0xbc/0xc4
[ 841.081773] ---[ end trace d4861bf0f486b2af ]---
We only caught this issue now because:
* The test result for the generic-64k kernel was lost during the infrastructure transition; it was found after rebuilding the test on the new Jenkins.
* The kernel BUG error message did not come up on the first attempt with the lowlatency-64k kernel.
* The tool in CKCT does not scan for the "kernel BUG" pattern.
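As an illustration of the last point, a console log scan for such patterns could look roughly like the Python sketch below. This is not the CKCT tool; the pattern list and the function name scan_console_log are hypothetical, and the sample lines are taken from the console output in this report.

```python
import re

# Hypothetical pattern list; a real tool would likely need more entries.
FATAL_PATTERNS = [
    re.compile(r"kernel BUG at \S+"),
    re.compile(r"Internal error: Oops"),
    re.compile(r"WARNING: CPU: \d+ PID: \d+ at \S+"),
]


def scan_console_log(lines):
    """Return (line_number, matched_text) for each line hitting a pattern."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for pattern in FATAL_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((lineno, match.group(0)))
                break  # report each line at most once
    return hits


sample = [
    "[ 840.687391] kernel BUG at mm/usercopy.c:99!",
    "[ 840.688377] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP",
    "[ 840.704843] lr : usercopy_abort+0x98/0x9c",
    "[ 841.041656] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:614 rcu_eqs_enter.constprop.0+0xa4/0xb0",
]
hits = scan_console_log(sample)
for lineno, text in hits:
    print(lineno, text)
```

A non-empty result could then be used to mark the run as failed even when the test itself did not report the error.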
The complete console log retrieved for J-generic-64k is attached.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2078300/+subscriptions