kernel-packages team mailing list archive
Message #85431
[Bug 1370421] Missing required logs.
This bug is missing log files that will aid in diagnosing the problem.
From a terminal window please run:
apport-collect 1370421
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.
** Changed in: linux (Ubuntu)
Status: New => Incomplete
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1370421
Title:
BUG: soft lockup - CPU#15 stuck for 59737s! [genload:22734]
Status in “linux” package in Ubuntu:
Incomplete
Bug description:
== Comment: #0 - ABDUL HALEEM <abdhalee@xxxxxxxxxx> - 2014-09-01 05:24:37 ==
---Problem Description---
CPU stalls and a soft lockup occur while running the ltpstress.sh test of the LTP suite. The detailed syslog and the test logs are attached.
Contact Information = abdhalee@xxxxxxxxxx
---uname output---
Linux ubuntu 3.16.0-10-generic #15-Ubuntu SMP Thu Aug 21 16:32:31 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = POWER8
---Debugger---
A debugger is not configured
---Steps to Reproduce---
- Ubuntu 14.10 LE guest running on Power 8 machine with Power KVM build 2_1_1.8
- Download and build the LTP suite on the guest. Run /opt/ltp/testscripts/ltpstress.sh -d /tmp/sardata -l /tmp/ltplog.12028 -m 128 -t 24 -S
- After 2 hours of test run, dmesg starts showing the trace messages below.
syslog:
---------
Aug 31 09:31:59 ubuntu kernel: [83796.274731] Adding 576k swap on swapfile29. Priority:-29 extents:1 across:576k FS
Aug 31 09:32:00 ubuntu in.rshd[8457]: connect from 127.0.0.1 (127.0.0.1)
Aug 31 09:32:01 ubuntu in.rshd[8459]: connect from 127.0.0.1 (127.0.0.1)
Aug 31 09:32:02 ubuntu in.rshd[8461]: connect from 127.0.0.1 (127.0.0.1)
Sep 1 04:42:36 ubuntu kernel: [147953.248523] INFO: rcu_sched detected stalls on CPUs/tasks: { 15} (detected by 2, t=92214 jiffies, g=440674, c=440673, q=304)
Sep 1 04:42:36 ubuntu kernel: [147953.248720] Task dump for CPU 15:
Sep 1 04:42:36 ubuntu kernel: [147953.248725] genload R running task 0 22734 22733 0x00040000
Sep 1 04:42:36 ubuntu kernel: [147953.248730] Call Trace:
Sep 1 04:42:36 ubuntu kernel: [147953.248740] [c0000000033239b0] [c000000000056fe4] ht64_call_hpte_insert1+0x4/0x3c (unreliable)
Sep 1 04:42:36 ubuntu kernel: [147953.248745] [c000000003323ab0] [c0000000000532c8] hash_preload+0x2f8/0x300
Sep 1 04:42:36 ubuntu kernel: [147953.248748] [c000000003323b30] [c00000000004eaf0] update_mmu_cache+0xf0/0x110
Sep 1 04:42:36 ubuntu kernel: [147953.248753] [c000000003323b70] [c00000000023559c] handle_mm_fault+0xa0c/0x11b0
Sep 1 04:42:36 ubuntu kernel: [147953.248758] [c000000003323c10] [c0000000009e58dc] do_page_fault+0x71c/0x990
Sep 1 04:42:36 ubuntu kernel: [147953.248762] [c000000003323e30] [c000000000009568] handle_page_fault+0x10/0x30
Sep 1 04:42:36 ubuntu kernel: [147953.250365] INFO: rcu_sched detected stalls on CPUs/tasks: { 15} (detected by 2, t=16035133 jiffies, g=440674, c=440673, q=304)
Sep 1 04:42:36 ubuntu kernel: [147953.250519] Task dump for CPU 15:
Sep 1 04:42:36 ubuntu kernel: [147953.250522] genload R running task 0 22734 22733 0x00040000
Sep 1 04:42:36 ubuntu kernel: [147953.250525] Call Trace:
Sep 1 04:42:36 ubuntu kernel: [147953.250528] [c0000000033239b0] [c000000000056fe4] ht64_call_hpte_insert1+0x4/0x3c (unreliable)
Sep 1 04:42:36 ubuntu kernel: [147953.250532] [c000000003323ab0] [c0000000000532c8] hash_preload+0x2f8/0x300
Sep 1 04:42:36 ubuntu kernel: [147953.250535] [c000000003323b30] [c00000000004eaf0] update_mmu_cache+0xf0/0x110
Sep 1 04:42:36 ubuntu kernel: [147953.250538] [c000000003323b70] [c00000000023559c] handle_mm_fault+0xa0c/0x11b0
Sep 1 04:42:36 ubuntu kernel: [147953.250541] [c000000003323c10] [c0000000009e58dc] do_page_fault+0x71c/0x990
Sep 1 04:42:36 ubuntu kernel: [147953.250544] [c000000003323e30] [c000000000009568] handle_page_fault+0x10/0x30
Sep 1 04:42:36 ubuntu kernel: [147953.257562] BUG: soft lockup - CPU#15 stuck for 59737s! [genload:22734]
Sep 1 04:42:36 ubuntu kernel: [147953.257647] Modules linked in: nfsv2 nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache pseries_rng rtc_generic e1000 ohci_pci
Other details :
------------------
@ubuntu:/tmp$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 16
NUMA node(s): 1
Model: IBM pSeries (emulated by qemu)
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-15
@ubuntu:/tmp$ free
             total       used       free     shared    buffers     cached
Mem:       2072704     892480    1180224        448     274240     132480
-/+ buffers/cache:     485760    1586944
Swap:      3460160      35392    3424768
@ubuntu:/tmp$ uptime
05:22:02 up 1 day, 19:06, 2 users, load average: 10.67, 9.10, 9.32
Thanks
== Comment: #1 - ABDUL HALEEM <abdhalee@xxxxxxxxxx> - 2014-09-01 05:31:58 ==
== Comment: #2 - ABDUL HALEEM <abdhalee@xxxxxxxxxx> - 2014-09-01 05:36:48 ==
== Comment: #5 - MAMATHA INAMDAR <mainamdar@xxxxxxxxxx> - 2014-09-05 05:03:56 ==
Hi Abdul,
Are you able to recreate this issue?
Please update the bug with your latest test results.
== Comment: #6 - ABDUL HALEEM <abdhalee@xxxxxxxxxx> - 2014-09-10 05:55:47 ==
(In reply to comment #5)
> Hi Abdul,
> Are you able to recreate this issue?
> Please update the bug with your latest test results.
Hi Mamatha,
I have started the test again with xmon enabled.
I will keep you updated on the status.
Thanks
== Comment: #7 - ABDUL HALEEM <abdhalee@xxxxxxxxxx> - 2014-09-10 05:59:17 ==
I have started the test on 3.16.0-14-generic and I still see these messages in syslog:
[ 8075.169576] Unable to find swap-space signature
[ 7452.105450] Unable to find swap-space signature
Should we worry about this?
The original problem has not reproduced yet. I will update the bug soon.
== Comment: #8 - Dan Streetman <ddstreet@xxxxxxxxxx> - 2014-09-10 08:44:21 ==
(In reply to comment #7)
> I have started the test on 3.16.0-14-generic and I still see these messages
> in syslog
>
> [ 8075.169576] Unable to find swap-space signature
> [ 7452.105450] Unable to find swap-space signature
>
> should we worry about this.
It looks like you have some kind of tests creating/adding swap files,
and I have no idea what those tests look like, so I don't know if this
is an expected result of the tests or not. Generally that error means
you are trying to swapon a swap file that isn't correctly initialized
with mkswap, or its header is corrupted.
Assuming your test isn't expecting a failure, you should just mkswap
again on whatever swap file is failing. It looks like "./swapfile01",
but since you're using relative paths, I can't tell you where it's
located.
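A quick way to check whether a given swap file still carries a valid header is to look for the signature directly. The following is a minimal Python sketch, not part of the original report. It assumes the current swap format, where mkswap writes the 10-byte string "SWAPSPACE2" at the end of the first page, and it assumes the file was initialized with the same page size as the running kernel. The function name and parameters are illustrative.

```python
import os

# Signature that mkswap writes for the current swap format (assumption
# stated above: it occupies the last 10 bytes of the first page).
SWAP_MAGIC = b"SWAPSPACE2"


def has_swap_signature(path, page_size=None):
    """Return True if the file at `path` has SWAPSPACE2 at the end of its first page.

    `page_size` defaults to the page size of the system running this script;
    pass it explicitly when checking a file created for a different page size.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    with open(path, "rb") as f:
        # The signature sits in the last len(SWAP_MAGIC) bytes of page 0.
        f.seek(page_size - len(SWAP_MAGIC))
        return f.read(len(SWAP_MAGIC)) == SWAP_MAGIC
```

Running has_swap_signature("./swapfile01") from the test's working directory would then tell whether the file needs mkswap again. A file shorter than one page simply returns False.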
== Comment: #9 - ABDUL HALEEM <abdhalee@xxxxxxxxxx> - 2014-09-11 04:09:13 ==
Hi,
I recreated the bug on the latest kernel, 3.16.0-14-generic.
As far as I recall, the scenario in which the kernel triggered the soft
lockup - CPU#15 traces was as follows.
During my first test run, the next day I saw that the guest was in the
'paused' state, because the host disk partition on which
/var/lib/libvirt/images is mounted was out of space. I freed up the disk
space and resumed the guest. My tests were still running, but dmesg
showed the trace messages.
So in my last run I recreated a similar scenario with xmon=on and found
that the traces are triggered when I suspend and resume my guest while
the tests are running, and not because of my actual test.
---Actual steps to reproduce---
- Enable xmon in /etc/default/grub, then run 'update-grub' and 'reboot'
- Run the ltpstress test
- Suspend the guest with 'virsh suspend <guest>'
- After a few seconds, resume the guest. The test continues to run fine
- dmesg shows the original trace messages, as below
However, when the traces were triggered, the console did not drop into
xmon, so I guess this might be a different problem.
I have kept the system in the same state.
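The suspend and resume part of the steps above could be scripted on the host. This is a minimal Python sketch, not taken from the original report. It assumes virsh is available on the host and that the caller has permission to control the guest. The function name, the guest name in the usage note, and the 5-second default pause are illustrative.

```python
import subprocess
import time


def suspend_resume(domain, pause_seconds=5, run=subprocess.run, sleep=time.sleep):
    """Suspend a libvirt guest, wait, then resume it.

    `run` and `sleep` are injectable so the sequence can be dry-run
    without touching a real guest (an illustrative design choice).
    """
    run(["virsh", "suspend", domain], check=True)
    try:
        sleep(pause_seconds)
    finally:
        # Always attempt the resume, even if the wait is interrupted.
        run(["virsh", "resume", domain], check=True)
```

For example, suspend_resume("guest1") would pause a guest with the hypothetical name guest1 for about 5 seconds while the stress test is running inside it.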
Trace messages:
[84735.190787] Adding 576k swap on swapfile27. Priority:-27 extents:1 across:576k FS
[84735.740298] Adding 576k swap on swapfile28. Priority:-28 extents:1 across:576k FS
[84736.062528] Adding 576k swap on swapfile29. Priority:-29 extents:1 across:576k FS
[84924.032436] BUG: soft lockup - CPU#0 stuck for 104s! [float_bessel:10251]
[84924.032507] Modules linked in: nfsv2 nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache pseries_rng rtc_generic shpchp ohci_pci e1000
[84924.032525] CPU: 0 PID: 10251 Comm: float_bessel Not tainted 3.16.0-14-generic #20-Ubuntu
[84924.032527] task: c000000003100000 ti: c00000003250c000 task.ti: c00000003250c000
[84924.032529] NIP: c0000000000110b4 LR: c0000000000110b4 CTR: 00003fffb4644120
[84924.032531] REGS: c00000003250fb90 TRAP: 0901 Not tainted (3.16.0-14-generic)
[84924.032532] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22002444 XER: 00000000
[84924.032538] CFAR: 00003fffb4645888 SOFTE: 1
GPR00: c00000000000a704 c00000003250fe10 c0000000013d49e0 0000000000000900
GPR04: 0000000000040004 0000000000000000 00000000009c0000 00000000ff001009
GPR08: 000182dee8f4d56f 000000007fefffff 0000000040cc8595 0000000000000000
GPR12: 0000000000002200 00003fffab8658f0
[84924.032552] NIP [c0000000000110b4] arch_local_irq_restore+0x74/0x90
[84924.032554] LR [c0000000000110b4] arch_local_irq_restore+0x74/0x90
[84924.032556] Call Trace:
[84924.032557] [c00000003250fe10] [0000000000002856] 0x2856 (unreliable)
[84924.032561] [c00000003250fe30] [c00000000000a704] ret_from_except_lite+0x30/0x60
[84924.032562] Instruction dump:
[84924.032563] 994d02ba 2fa30000 409e0024 e92d0020 61298000 7d210164 38210020 e8010010
[84924.032566] 7c0803a6 4e800020 60420000 4bff1315 <60000000> 4bffffe4 60420000 e92d0020
[84926.062119] Adding 576k swap on ./swapfile01. Priority:-2 extents:1 across:576k FS
[84936.733247] Adding 65472k swap on ./swapfile01. Priority:-2 extents:2 across:114624k
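The "stuck for" values reported in this bug differ widely (59737s in the first run, 104s here), so it may help to extract them from the logs and compare them with how long the guest was actually paused. The following Python sketch is an illustration only. The regular expression is derived from the two soft lockup messages quoted in this report, not from the kernel source, and the function names are made up for the example.

```python
import re

# Pattern modelled on the messages quoted in this bug, e.g.
# "BUG: soft lockup - CPU#15 stuck for 59737s! [genload:22734]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(lines):
    """Yield (cpu, seconds, comm, pid) for every soft lockup line found."""
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            yield int(m["cpu"]), int(m["secs"]), m["comm"], int(m["pid"])


def format_duration(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"
```

Applied to the messages above, format_duration(59737) gives 16h35m37s and format_duration(104) gives 0h01m44s.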
Thanks
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1370421/+subscriptions