canonical-ubuntu-qa team mailing list archive
Message #04236
[Bug 1998738] Re: dev test from ubuntu_stress_smoke_tests cause kernel oops on F-5.4 xilinx ZCU106
** Changed in: linux-xilinx-zynqmp (Ubuntu)
Status: New => Invalid
--
You received this bug notification because you are a member of Canonical
Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/1998738
Title:
dev test from ubuntu_stress_smoke_tests cause kernel oops on F-5.4
xilinx ZCU106
Status in ubuntu-kernel-tests:
New
Status in linux-xilinx-zynqmp package in Ubuntu:
Invalid
Status in linux-xilinx-zynqmp source package in Focal:
New
Bug description:
This issue can only be reproduced on the ZCU106. It leaves some
stress-ng processes running, which eventually causes the Jenkins job to hang.
Reproduced with stress-ng at commit 91ec6bccd7 (V0.15.00). Syslog output:
stress-ng: invoked with './stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
stress-ng: system: '202008-28164-ZCU106' Linux 5.4.0-1019-xilinx-zynqmp #22-Ubuntu SMP Thu Nov 17 05:04:22 UTC 2022 aarch64
stress-ng: memory (MB): total 3929.76, free 2479.07, shared 4.30, buffer 59.98, swap 0.00, free swap 0.00
stress-ng: info: [3037] setting to a 5 second run per stressor
stress-ng: info: [3037] dispatching hogs: 4 dev
kernel: [ 981.702313] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.702829] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
kernel: [ 981.708039] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.708569] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
kernel: [ 981.709027] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.709501] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
kernel: [ 981.734320] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.734859] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
kernel:[ 981.797006] Internal error: Oops: 96000004 [#1] SMP
kernel: [ 981.768878] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.768958] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.768961] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000087000000f48
kernel: [ 981.768966] Mem abort info:
kernel: [ 981.779704] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.782475] ESR = 0x96000004
kernel: [ 981.782478] EC = 0x25: DABT (current EL), IL = 32 bits
kernel: [ 981.782480] SET = 0, FnV = 0
kernel: [ 981.782484] EA = 0, S1PTW = 0
kernel: [ 981.785524] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.790822] Data abort info:
kernel: [ 981.790824] ISV = 0, ISS = 0x00000004
kernel: [ 981.790826] CM = 0, WnR = 0
kernel: [ 981.790830] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000838768000
kernel: [ 981.790833] [0000087000000f48] pgd=0000000000000000
kernel: [ 981.793875] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.797006] Internal error: Oops: 96000004 [#1] SMP
kernel: [ 981.797010] Modules linked in: xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_nat xt_CHECKSUM iptable_nat xt_MASQUERADE nf_nat iptable_filter fuse dm_multipath dm_mod al5e al5d allegro xlnx_vcu_clk xlnx_vcu xilinx_hdmi_tx xilinx_hdmi_rx xlnx_vcu_core dp159 xilinx_vphy lm63 ina2xx_adc mali dmaproxy nfsd zocl
kernel: [ 981.805628] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.808485] CPU: 1 PID: 3044 Comm: stress-ng-dev Not tainted 5.4.0-1019-xilinx-zynqmp #22-Ubuntu
kernel: [ 981.808487] Hardware name: ZynqMP ZCU106 RevA (DT)
kernel: [ 981.808491] pstate: 00400005 (nzcv daif +PAN -UAO)
kernel: [ 981.812321] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.815269] pc : __mutex_lock.isra.0+0x170/0x510
kernel: [ 981.815273] lr : __mutex_lock_slowpath+0x28/0x38
kernel: [ 981.815276] sp : ffff800017c3bb30
kernel: [ 981.821772] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.826563] x29: ffff800017c3bb30 x28: ffff00083460ec00
kernel: [ 981.826567] x27: 0000ffffb3f2f000 x26: ffff000855fda500
kernel: [ 981.826571] x25: 0000000000000000 x24: ffff0008498fd400
kernel: [ 981.826574] x23: 0000000000000031 x22: ffff000875878750
kernel: [ 981.826578] x21: 0000000000000002 x20: ffff0008385d4e40
kernel: [ 981.835222] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.840035] x19: ffff0008758787f0 x18: 0000000000000000
kernel: [ 981.840039] x17: 0000000000000000 x16: 0000000000000000
kernel: [ 981.840042] x15: 0000000000000000 x14: 0000000000000000
kernel: [ 981.840046] x13: 0000000000000000 x12: 0000000000000000
kernel: [ 981.868428] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.875905] x11: 0000000000000000 x10: 0000000000100000
kernel: [ 981.875909] x9 : 00000000000000fb x8 : 0000000010044400
kernel: [ 981.875912] x7 : 0000000000000000 x6 : ffff00083460e0c0
kernel: [ 981.875915] x5 : 0000000000000015 x4 : 0000000000000014
kernel: [ 981.875919] x3 : 0000087000000f00 x2 : ffff0008385d4e40
kernel: [ 981.875922] x1 : 0000087000000f00 x0 : 0000087000000f00
kernel: [ 981.875926] Call trace:
kernel: [ 981.875933] __mutex_lock.isra.0+0x170/0x510
kernel: [ 981.875939] __mutex_lock_slowpath+0x28/0x38
kernel: [ 981.885784] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.889485] mutex_lock+0x48/0x58
kernel: [ 981.889491] xm2msc_mmap+0x38/0x68
kernel: [ 981.889497] v4l2_mmap+0x7c/0xb8
kernel: [ 981.889504] mmap_region+0x364/0x5b0
kernel: [ 981.889511] do_mmap+0x294/0x478
kernel: [ 981.894358] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.902880] vm_mmap_pgoff+0xf4/0x120
kernel: [ 981.902885] ksys_mmap_pgoff+0x1ac/0x240
kernel: [ 981.902891] __arm64_sys_mmap+0x38/0x50
kernel: [ 981.902897] el0_svc_common.constprop.0+0x78/0x180
kernel: [ 981.902903] el0_svc_handler+0x84/0xa0
Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
kernel:[ 981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
kernel: [ 981.907665] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.912107] el0_svc+0x8/0x1c0
kernel: [ 981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
kernel: [ 981.912121] ---[ end trace bab66edb32cbb4db ]---
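A quick way to cross-check the oops above is to decode the ESR value and the faulting instruction word by hand. The sketch below is illustrative only (the helper names decode_esr and decode_ldr_w_uimm are made up for this note); it uses the architectural arm64 field layouts for a data abort and for the 32-bit LDR (immediate, unsigned offset) encoding. Reading the loaded-from pointer as a corrupted or uninitialised field of the lock is an assumption, not something the log proves.

```python
# Decode ESR 0x96000004 and the faulting instruction b9404801 from the oops.
# Helper names are hypothetical; bit layouts follow the Arm architecture.

def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value for a data abort into its fields."""
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 = 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,
        "SET": (iss >> 11) & 0x3,
        "FnV": (iss >> 10) & 0x1,
        "EA": (iss >> 9) & 0x1,
        "CM": (iss >> 8) & 0x1,
        "S1PTW": (iss >> 7) & 0x1,
        "WnR": (iss >> 6) & 0x1,       # 0 = fault on a read
        "DFSC": iss & 0x3F,            # 0x04 = translation fault, level 0
    }

def decode_ldr_w_uimm(insn: int) -> tuple:
    """Decode LDR Wt, [Xn, #imm]; returns (rt, rn, byte_offset)."""
    if insn & 0xFFC00000 != 0xB9400000:
        raise ValueError("not a 32-bit LDR (immediate, unsigned offset)")
    rt = insn & 0x1F
    rn = (insn >> 5) & 0x1F
    offset = ((insn >> 10) & 0xFFF) * 4   # imm12 is scaled by the access size
    return rt, rn, offset

fields = decode_esr(0x96000004)
rt, rn, offset = decode_ldr_w_uimm(0xB9404801)
x0 = 0x0000087000000F00                   # x0 from the register dump
print(f"EC={fields['EC']:#x} DFSC={fields['DFSC']:#x} WnR={fields['WnR']}")
print(f"ldr w{rt}, [x{rn}, #{offset}] -> fault address {x0 + offset:#018x}")
```

The decoded fields match the "Mem abort info" lines (EC = 0x25, WnR = 0), and x0 + 72 equals the reported fault address 0000087000000f48. In other words, __mutex_lock faulted while reading 0x48 bytes past the pointer held in x0, which is not a valid kernel address; the "pgd=0000000000000000" line is consistent with the level 0 translation fault.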
Here is the output when running this test:
$ time sudo ./stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable
stress-ng: debug: [3037] invoked with './stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
stress-ng: debug: [3037] stress-ng 0.15.00 g91ec6bccd7e9
stress-ng: debug: [3037] system: Linux 202008-28164-ZCU106 5.4.0-1019-xilinx-zynqmp #22-Ubuntu SMP Thu Nov 17 05:04:22 UTC 2022 aarch64
stress-ng: debug: [3037] RAM total: 3.8G, RAM free: 2.4G, swap free: 0.0
stress-ng: debug: [3037] temporary file path: '.', filesystem type: ext2
stress-ng: debug: [3037] 4 processors online, 4 processors configured
stress-ng: info: [3037] setting to a 5 second run per stressor
stress-ng: info: [3037] dispatching hogs: 4 dev
stress-ng: debug: [3037] cache allocate: using defaults, cannot determine cache level details
stress-ng: debug: [3037] cache allocate: shared cache buffer size: 2048K
stress-ng: debug: [3037] starting stressors
stress-ng: debug: [3039] dev: started [3039] (instance 0)
stress-ng: debug: [3040] dev: started [3040] (instance 1)
stress-ng: debug: [3037] 4 stressors started
stress-ng: debug: [3041] dev: started [3041] (instance 2)
stress-ng: debug: [3042] dev: started [3042] (instance 3)
Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
kernel:[ 981.797006] Internal error: Oops: 96000004 [#1] SMP
Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
kernel:[ 981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
stress-ng: debug: [3042] dev: exited [3042] (instance 3)
stress-ng: debug: [3041] dev: exited [3041] (instance 2)
stress-ng: info: [3039] dev: 19 of 383 devices opened and exercised
stress-ng: debug: [3039] dev: exited [3039] (instance 0)
stress-ng: debug: [3037] process [3039] terminated
(hung here)
Processes 3039, 3041 and 3042 report "dev: exited", but process 3040 (instance 1) never does.
strace output:
$ sudo strace -p 3040
strace: Process 3040 attached
wait4(3044, 0xffffda2c3214, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
kill(3044, SIGALRM) = 0
kill(3044, SIGKILL) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, {tv_sec=0, tv_nsec=989179}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
wait4(3044, 0xffffda2c3214, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
kill(3044, SIGALRM) = 0
kill(3044, SIGKILL) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, {tv_sec=0, tv_nsec=505466}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
(repeats)
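The strace loop above can be summarised mechanically: count the signals that kill() delivered successfully and check whether wait4() ever returned a PID. The sketch below is a hypothetical helper written for this note (it is not part of stress-ng or strace), and the SAMPLE text is one cycle copied from the trace above with long arguments elided in the setitimer-free lines only.

```python
import re
from collections import Counter

# Successful kill() calls: "kill(PID, SIGNAME) = 0"
KILL_RE = re.compile(r"^kill\((\d+), (SIG\w+)\)\s*=\s*0\b")
# wait4() calls: capture the waited-for PID and the return value
WAIT_RE = re.compile(r"^wait4\((\d+),.*=\s*(\S+)")

SAMPLE = """\
wait4(3044, 0xffffda2c3214, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
kill(3044, SIGALRM) = 0
kill(3044, SIGKILL) = 0
wait4(3044, 0xffffda2c3214, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
kill(3044, SIGALRM) = 0
kill(3044, SIGKILL) = 0
"""

def summarize(lines):
    """Return (Counter of (pid, signal) sent, set of pids actually reaped)."""
    kills = Counter()
    reaped = set()
    for line in lines:
        line = line.strip()
        m = KILL_RE.match(line)
        if m:
            kills[(int(m.group(1)), m.group(2))] += 1
            continue
        m = WAIT_RE.match(line)
        # A completed wait4() returns the child's PID; "?" means interrupted.
        if m and m.group(2).isdigit():
            reaped.add(int(m.group(1)))
    return kills, reaped

kills, reaped = summarize(SAMPLE.splitlines())
for (pid, sig), count in sorted(kills.items()):
    print(f"pid {pid}: {sig} sent {count} time(s), reaped={pid in reaped}")
```

On this trace, every kill(3044, SIGKILL) returns 0, yet wait4(3044, ...) is only ever interrupted and never returns the PID. That is consistent with PID 3044 (the task named in the oops) being unable to exit, though the trace alone does not show why.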
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1998738/+subscriptions