canonical-ubuntu-qa team mailing list archive
-
canonical-ubuntu-qa team
-
Mailing list archive
-
Message #04818
[Bug 2072972] Re: ftrace:test.d--ftrace--func_traceonoff_triggers.tc in ubuntu_kselftests_ftrace triggers kernel NULL pointer dereference on node blanka
There is not much delta between -181 and -182, and I successfully
triggered this on 5.4.0-181 + 181 source code. So the bisect result in
#1 is invalid.
As this test fail rate is not 100% reproducible, I am going to run it
for multiple times.
I can reproduce this all the way back to 5.4.0-170-generic.
And even 5.4.0-150 is failing too, the trace is a bit different though:
[ 3706.368639] BUG: kernel NULL pointer dereference, address: 0000000000000050
[ 3706.376413] #PF: supervisor read access in kernel mode
[ 3706.382146] #PF: error_code(0x0000) - not-present page
[ 3706.387878] PGD 0 P4D 0
[ 3706.390704] Oops: 0000 [#1] SMP NOPTI
[ 3706.394789] CPU: 255 PID: 0 Comm: swapper/255 Tainted: G OE 5.4.0-150-generic #167-Ubuntu
[ 3706.405274] Hardware name: NVIDIA DGXA100 920-23687-2530-000/DGXA100, BIOS 1.25 08/31/2023
[ 3706.414504] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100
[ 3706.421206] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 50 cb c9 ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 99 4b
[ 3706.442162] RSP: 0018:ffffa29e5cad8da0 EFLAGS: 00010286
[ 3706.447993] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000080000100
[ 3706.455956] RDX: ffff8eaf36ab00ac RSI: 0000000000000100 RDI: ffff8eaf36ab00a8
[ 3706.463918] RBP: ffffa29e5cad8e08 R08: ffff8eaf36ab00a8 R09: 0000000000000100
[ 3706.471878] R10: 0000000000000004 R11: 0000000000000004 R12: ffff8eaf36ab00ac
[ 3706.479840] R13: 0000000000000000 R14: 0000000000000003 R15: ffff8e0fc0048800
[ 3706.487804] FS: 0000000000000000(0000) GS:ffff8eaf7fbc0000(0000) knlGS:0000000000000000
[ 3706.496834] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3706.503243] CR2: 0000000000000050 CR3: 000000bd7e40a000 CR4: 0000000000340ee0
[ 3706.511204] Call Trace:
[ 3706.513932] <IRQ>
[ 3706.516177] wb_timer_fn+0x1d6/0x3c0
[ 3706.520169] ? blk_stat_free_callback_rcu+0x30/0x30
[ 3706.525610] blk_stat_timer_fn+0x134/0x140
[ 3706.530183] call_timer_fn+0x32/0x130
[ 3706.534261] __run_timers.part.0+0x180/0x280
[ 3706.539026] ? trace_event_raw_event_softirq+0x5d/0xa0
[ 3706.544759] run_timer_softirq+0x2a/0x50
[ 3706.549137] __do_softirq+0xd1/0x2c1
[ 3706.553125] irq_exit+0xae/0xb0
[ 3706.556629] smp_apic_timer_interrupt+0x7b/0x140
[ 3706.561780] apic_timer_interrupt+0xf/0x20
[ 3706.566348] </IRQ>
[ 3706.568689] RIP: 0010:native_safe_halt+0xe/0x10
[ 3706.573741] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d f6 14 51 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d e6 14 51 00 fb f4 <c3> 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 1d 3b 62 ff 65
[ 3706.594697] RSP: 0018:ffffa29e596ebe70 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 3706.603145] RAX: 0000000000023800 RBX: ffff8e2f5d0a3768 RCX: 00000000000416f2
[ 3706.611108] RDX: 00000000000416f2 RSI: 0000000000000000 RDI: ffffffffb1ec5f60
[ 3706.619070] RBP: ffffa29e596ebe90 R08: 0000000000000000 R09: 0000000000000000
[ 3706.627031] R10: 0000000000000000 R11: 0000000000000004 R12: 00000000000000ff
[ 3706.634993] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 3706.642957] ? default_idle+0x20/0x140
[ 3706.647140] arch_cpu_idle+0x15/0x20
[ 3706.651127] default_idle_call+0x23/0x30
[ 3706.655506] do_idle+0x1fb/0x270
[ 3706.659108] ? complete+0x49/0x50
[ 3706.662805] cpu_startup_entry+0x20/0x30
[ 3706.667181] start_secondary+0x173/0x1d0
[ 3706.671560] secondary_startup_64+0xa4/0xb0
[ 3706.676217] Modules linked in: nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua kvm_amd kvm ipmi_ssif input_leds binfmt_misc mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) ccp ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel msr efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic crct10dif_pclmul ast crc32_pclmul drm_vram_helper ghash_clmulni_intel ttm mlx5_core(OE) aesni_intel pci_hyperv_intf crypto_simd drm_kms_helper mlxdevm(OE) cryptd glue_helper syscopyarea auxiliary(OE) sysfillrect mpt3sas igb sysimgblt mlx_compat(OE) tls raid_class fb_sys_fops dca uas usbhid i2c_algo_bit scsi_transport_sas usb_storage hid drm mlxfw(OE) nvme psample nvme_core
[ 3706.755113] CR2: 0000000000000050
[ 3706.758930] ---[ end trace d6371d7677eb0b6d ]---
[ 3706.863060] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100
[ 3706.869763] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 50 cb c9 ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 99 4b
[ 3706.890717] RSP: 0018:ffffa29e5cad8da0 EFLAGS: 00010286
[ 3706.896545] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000080000100
[ 3706.904505] RDX: ffff8eaf36ab00ac RSI: 0000000000000100 RDI: ffff8eaf36ab00a8
[ 3706.912466] RBP: ffffa29e5cad8e08 R08: ffff8eaf36ab00a8 R09: 0000000000000100
[ 3706.920428] R10: 0000000000000004 R11: 0000000000000004 R12: ffff8eaf36ab00ac
[ 3706.928388] R13: 0000000000000000 R14: 0000000000000003 R15: ffff8e0fc0048800
[ 3706.936352] FS: 0000000000000000(0000) GS:ffff8eaf7fbc0000(0000) knlGS:0000000000000000
[ 3706.945380] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3706.951789] CR2: 0000000000000050 CR3: 000000bd7e40a000 CR4: 0000000000340ee0
[ 3706.959751] Kernel panic - not syncing: Fatal exception in interrupt
[ 3706.967412] Kernel Offset: 0x2f400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 3707.430969] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
--
You received this bug notification because you are a member of Canonical
Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/2072972
Title:
ftrace:test.d--ftrace--func_traceonoff_triggers.tc in
ubuntu_kselftests_ftrace triggers kernel NULL pointer dereference on
node blanka
Status in ubuntu-kernel-tests:
New
Bug description:
Issue found on node "blanka" with Focal 5.4.0-190.210 in s2024.06.10
After some manual tests I noticed that this issue is not 100% reproducible:
* Focal 5.4.0-189.209 + -189 source failed with 5 out of 5 attempts
* Focal 5.4.0-190.210 + -190 source failed with 2 out of 5 attempts
* Focal 5.4.0-192.212 + -192 source failed with 3 out of 5 attempts
Despite with this high fail rate, I can't see this failure in our test
history all the way back to 2024.01.08 (except with 5.4.0-190.210),
perhaps we retest it and it has passed?
$ sudo ./ftracetest -v test.d/ftrace/func_traceonoff_triggers.tc
=== Ftrace unit tests ===
[1] ftrace - test for function traceon/off triggers
dmesg output:
[ 7112.186092] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
[ 7128.566260] BUG: kernel NULL pointer dereference, address: 0000000000000050
[ 7128.574031] #PF: supervisor read access in kernel mode
[ 7128.579763] #PF: error_code(0x0000) - not-present page
[ 7128.585495] PGD 0 P4D 0
[ 7128.588320] Oops: 0000 [#1] SMP NOPTI
[ 7128.592405] CPU: 129 PID: 0 Comm: swapper/129 Tainted: G OE 5.4.0-190-generic #210-Ubuntu
[ 7128.602887] Hardware name: NVIDIA DGXA100 920-23687-2530-000/DGXA100, BIOS 1.25 08/31/2023
[ 7128.612119] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100
[ 7128.618820] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 b0 ab c9 ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 c9 12
[ 7128.639774] RSP: 0018:ffffb1779b140da0 EFLAGS: 00010282
[ 7128.645604] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000080000100
[ 7128.653565] RDX: ffff8fe7ae7052f8 RSI: 0000000000000100 RDI: ffff8fe7ae7052f4
[ 7128.661525] RBP: ffffb1779b140e08 R08: ffff8fe7ae7052f4 R09: 0000000000000100
[ 7128.669486] R10: ffffd1777fdcb960 R11: 0000000000000000 R12: ffff8fe7ae7052f8
[ 7128.677445] R13: 00000000ffffffff R14: 0000000000000002 R15: ffff9047fe8f5000
[ 7128.685409] FS: 0000000000000000(0000) GS:ffff8fe80f840000(0000) knlGS:0000000000000000
[ 7128.694437] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7128.700847] CR2: 0000000000000050 CR3: 000000cb2760a000 CR4: 0000000000340ee0
[ 7128.708809] Call Trace:
[ 7128.711535] <IRQ>
[ 7128.713782] ? show_regs.cold+0x1a/0x1f
[ 7128.718058] ? __die+0x90/0xd9
[ 7128.721467] ? no_context+0x196/0x380
[ 7128.725555] ? ring_buffer_lock_reserve+0x165/0x3d0
[ 7128.730996] ? ring_buffer_unlock_commit+0x27/0x130
[ 7128.736436] ? __bad_area_nosemaphore+0x50/0x1a0
[ 7128.741585] ? bad_area_nosemaphore+0x16/0x20
[ 7128.746445] ? do_user_addr_fault+0x267/0x450
[ 7128.751306] ? trace_event_raw_event_x86_exceptions+0x80/0xd0
[ 7128.757718] ? __do_page_fault+0x58/0x90
[ 7128.762092] ? do_page_fault+0x2c/0xe0
[ 7128.766276] ? page_fault+0x34/0x40
[ 7128.770167] ? trace_event_raw_event_wbt_timer+0x6f/0x100
[ 7128.776190] wb_timer_fn+0x1d6/0x3c0
[ 7128.780179] ? blk_mq_tag_update_depth+0x100/0x100
[ 7128.785522] blk_stat_timer_fn+0x13a/0x140
[ 7128.790094] call_timer_fn+0x32/0x130
[ 7128.794179] __run_timers.part.0+0x180/0x280
[ 7128.798945] ? trace_event_raw_event_softirq+0x5d/0xa0
[ 7128.804677] run_timer_softirq+0x2a/0x50
[ 7128.809053] __do_softirq+0xd1/0x2c1
[ 7128.813041] irq_exit+0xae/0xb0
[ 7128.816545] smp_apic_timer_interrupt+0x7b/0x140
[ 7128.821696] apic_timer_interrupt+0xf/0x20
[ 7128.826264] </IRQ>
[ 7128.828603] RIP: 0010:native_safe_halt+0xe/0x10
[ 7128.833655] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d a6 14 50 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 96 14 50 00 fb f4 <c3> 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 ed 46 61 ff 65
[ 7128.854607] RSP: 0018:ffffb177992fbe70 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 7128.863053] RAX: 0000000000023800 RBX: ffff90676cadbf28 RCX: 000000000003514a
[ 7128.871013] RDX: 000000000003514a RSI: 0000000000000000 RDI: ffffffffb06c6120
[ 7128.878974] RBP: ffffb177992fbe90 R08: 00000000000001ac R09: 0000000000000000
[ 7128.886936] R10: 0000000000020000 R11: 0000000000000002 R12: 0000000000000081
[ 7128.894895] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 7128.902858] ? default_idle+0x20/0x140
[ 7128.907040] arch_cpu_idle+0x15/0x20
[ 7128.911027] default_idle_call+0x23/0x30
[ 7128.915403] do_idle+0x1fb/0x270
[ 7128.919004] ? complete+0x49/0x50
[ 7128.922699] cpu_startup_entry+0x20/0x30
[ 7128.927074] start_secondary+0x178/0x1d0
[ 7128.931452] secondary_startup_64+0xa4/0xb0
[ 7128.936116] Modules linked in: nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod edac_mce_amd kvm_amd kvm ipmi_ssif input_leds binfmt_misc mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) ccp k10temp ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel msr ramoops reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure ast crct10dif_pclmul crc32_pclmul drm_vram_helper ghash_clmulni_intel mlx5_core(OE) ttm aesni_intel crypto_simd drm_kms_helper pci_hyperv_intf mlxdevm(OE) cryptd syscopyarea sysfillrect auxiliary(OE) glue_helper igb tls sysimgblt uas hid_generic mpt3sas mlxfw(OE) psample dca raid_class usbhid fb_sys_fops scsi_transport_sas nvme i2c_algo_bit usb_storage hid drm mlx_compat(OE) nvme_core i2c_piix4
[ 7129.023659] CR2: 0000000000000050
[ 7129.027471] ---[ end trace 5d27e00102fa9701 ]---
[ 7129.052391] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100
[ 7129.059093] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 b0 ab c9 ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 c9 12
[ 7129.080046] RSP: 0018:ffffb1779b140da0 EFLAGS: 00010282
[ 7129.085875] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000080000100
[ 7129.093835] RDX: ffff8fe7ae7052f8 RSI: 0000000000000100 RDI: ffff8fe7ae7052f4
[ 7129.101795] RBP: ffffb1779b140e08 R08: ffff8fe7ae7052f4 R09: 0000000000000100
[ 7129.109757] R10: ffffd1777fdcb960 R11: 0000000000000000 R12: ffff8fe7ae7052f8
[ 7129.117718] R13: 00000000ffffffff R14: 0000000000000002 R15: ffff9047fe8f5000
[ 7129.125680] FS: 0000000000000000(0000) GS:ffff8fe80f840000(0000) knlGS:0000000000000000
[ 7129.134706] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7129.141115] CR2: 0000000000000050 CR3: 000000cb2760a000 CR4: 0000000000340ee0
[ 7129.149075] Kernel panic - not syncing: Fatal exception in interrupt
[ 7129.158266] Kernel Offset: 0x2dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 7129.193426] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2072972/+subscriptions
References