kernel-packages team mailing list archive
Message #143778
[Bug 1508767] Re: IBM POWER8 unhandled signal 11 / SEGV
SEGVs across the 7 days of rotated syslog:
| hloeung@floette:~$ zgrep -h SEGV /var/log/syslog*
| Oct 28 14:46:34 floette kernel: [1351174.845829] init: rsyslog main process (2652) killed by SEGV signal
| hloeung@bagon:~$ zgrep -h SEGV /var/log/syslog*
| Nov 2 22:17:03 bagon kernel: [2401829.665556] init: neutron-plugin-openvswitch-agent main process (124418) killed by SEGV signal
| Nov 3 00:20:46 bagon kernel: [2409252.286349] init: nova-compute main process (125673) killed by SEGV signal
| Oct 26 11:51:25 bagon kernel: [1759496.565022] init: nova-compute main process (94922) killed by SEGV signal
| Oct 26 11:56:08 bagon kernel: [1759778.693294] init: neutron-plugin-openvswitch-agent main process (18574) killed by SEGV signal
| Oct 26 11:56:23 bagon kernel: [1759794.417232] init: neutron-plugin-openvswitch-agent main process (95171) killed by SEGV signal
| hloeung@gligar:~$ zgrep -h SEGV /var/log/syslog*
| Oct 30 18:32:56 gligar kernel: [745109.275184] init: neutron-plugin-openvswitch-agent main process (4705) killed by SEGV signal
| Oct 30 18:32:56 gligar kernel: [745109.776233] init: neutron-plugin-openvswitch-agent main process (88517) killed by SEGV signal
| Oct 30 18:32:57 gligar kernel: [745110.335622] init: neutron-plugin-openvswitch-agent main process (88527) killed by SEGV signal
| hloeung@patrat:~$ zgrep -h SEGV /var/log/syslog*
| Oct 27 08:18:29 patrat kernel: [508926.329315] init: neutron-plugin-openvswitch-agent main process (51113) killed by SEGV signal
I've disabled KSM as suggested. I'll try to get wgrant or cjwatson to
trigger a full rebuild and get some load on these compute nodes.
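For anyone wanting to confirm the KSM state on a node, here is a minimal sketch. It assumes the standard sysfs control file /sys/kernel/mm/ksm/run is present; the message above does not say how KSM was disabled, so this only reads the current state. The function names are illustrative.

```python
from pathlib import Path

# Standard sysfs control file for Kernel Samepage Merging (KSM).
KSM_RUN = Path("/sys/kernel/mm/ksm/run")

# Meaning of the values written to the "run" file.
KSM_STATES = {
    "0": "stopped (already-merged pages stay merged)",
    "1": "running",
    "2": "stopped, all merged pages unmerged",
}


def describe_ksm(value):
    """Translate the raw contents of the 'run' file into a readable state."""
    value = value.strip()
    return KSM_STATES.get(value, "unknown value %r" % value)


def ksm_state(run_file=KSM_RUN):
    """Read the sysfs file and describe the current KSM state."""
    return describe_ksm(Path(run_file).read_text())


if __name__ == "__main__":
    print("KSM:", ksm_state())
```

A value of 0 only stops the merging daemon; pages that were already merged stay shared until 2 is written, which may matter when judging whether the segfaults stop after the change.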
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1508767
Title:
IBM POWER8 unhandled signal 11 / SEGV
Status in Ubuntu Cloud Archive:
New
Status in apparmor package in Ubuntu:
Invalid
Status in linux package in Ubuntu:
Confirmed
Status in linux-meta-lts-vivid package in Ubuntu:
New
Bug description:
Hi,
We have a few IBM POWER8 servers which we're currently using as
OpenStack nova compute nodes. It seems we're regularly running into
issues where processes are segfaulting:
| hloeung@gligar:~$ zgrep -E '(SEGV)|(unhandled signal 11)' /var/log/syslog.5.gz
| Oct 16 23:31:38 gligar kernel: [88351.465559] neutron-openvsw[29733]: unhandled signal 11 at 88f9010000000000 nip 00000000100ba0d8 lr 00000000101ad860 code 30001
| Oct 16 23:31:38 gligar kernel: [88351.566909] init: neutron-plugin-openvswitch-agent main process (29733) killed by SEGV signal
| Oct 16 23:31:38 gligar kernel: [88351.746611] apport[29500]: unhandled signal 11 at 8850e467250040a8 nip 0000000010201f80 lr 0000000010202984 code 30001
| Oct 16 23:31:39 gligar kernel: [88352.245829] neutron-rootwra[29749]: unhandled signal 11 at 0809c4b610000000 nip 000000001014ae4c lr 000000001014b544 code 30001
| Oct 16 23:31:50 gligar kernel: [88364.040340] neutron-rootwra[30060]: unhandled signal 11 at 08a305c12b000000 nip 00000000100b74d0 lr 00000000100b73e4 code 30001
| Oct 16 23:31:51 gligar kernel: [88364.174218] neutron-rootwra[30065]: unhandled signal 11 at 088eb28e2f004078 nip 00000000100b5974 lr 00000000100aa794 code 30001
| Oct 16 23:31:52 gligar kernel: [88365.195380] neutron-rootwra[30098]: unhandled signal 11 at 88c939e322000008 nip 00000000100c8b28 lr 0000000010060384 code 30001
| Oct 16 23:31:52 gligar kernel: [88365.362374] neutron-rootwra[30106]: unhandled signal 11 at 882c58ad2800f04f nip 00003fffaef81220 lr 00003fffaef811a0 code 30001
| Oct 16 23:32:27 gligar kernel: [88400.966976] neutron-rootwra[30341]: unhandled signal 11 at 88d1fbe922001008 nip 00000000100c8b28 lr 0000000010060384 code 30001
| Oct 16 23:32:47 gligar kernel: [88420.953053] neutron-rootwra[30412]: unhandled signal 11 at 11b6629054008000 nip 00003fff9a864ac4 lr 00003fff9a84c42c code 30001
| Oct 16 23:34:49 gligar kernel: [88542.778503] neutron-rootwra[30977]: unhandled signal 11 at 88540f00000010a8 nip 00000000100aa768 lr 00000000100b74e8 code 30001
| Oct 16 23:35:23 gligar kernel: [88576.700721] neutron-openvsw[29739]: unhandled signal 11 at 08bfcbf7210000a8 nip 00000000100ab390 lr 00000000100b7c38 code 30001
| Oct 16 23:35:23 gligar kernel: [88576.804961] init: neutron-plugin-openvswitch-agent main process (29739) killed by SEGV signal
| Oct 16 23:36:01 gligar kernel: [88614.995497] nova-compute[31662]: unhandled signal 11 at 8846c1c81f004008 nip 000000001014c2f0 lr 0000000010151080 code 30001
| Oct 16 23:36:02 gligar kernel: [88615.110735] nova-compute[4331]: unhandled signal 11 at 88befae9220010a8 nip 00000000100b5c8c lr 000000001014c734 code 30001
| Oct 16 23:36:02 gligar kernel: [88615.219436] init: nova-compute main process (4331) killed by SEGV signal
| Oct 17 03:59:56 gligar kernel: [104449.890256] landscape-packa[63283]: unhandled signal 11 at 02f0000000000008 nip 00000000101abeac lr 00000000100a8738 code 30001
| Oct 17 04:05:00 gligar kernel: [104753.718195] sudo[63915]: unhandled signal 11 at 08e06105d1dcfff8 nip 00003fffb15cf7e4 lr 00003fffb15cfa00 code 30001
| hloeung@floette:~$ zgrep -E '(SEGV)|(unhandled signal 11)' /var/log/syslog.7.gz
| Oct 14 16:55:30 floette kernel: [149326.697938] rsync[9915]: unhandled signal 11 at 00003ffff7cb0000 nip 00003fffa242d054 lr 00003fffa2426560 code 30001
| Oct 14 21:05:57 floette kernel: [164353.333697] apparmor_parser[102284]: unhandled signal 11 at 08680f0000000000 nip 000000001004bbf8 lr 0000000010028de4 code 30001
| Oct 14 22:21:24 floette kernel: [168880.481778] neutron-rootwra[153488]: unhandled signal 11 at 8860fbe21f0000a8 nip 00000000100aa768 lr 00000000100b74e8 code 30001
| Oct 14 22:21:26 floette kernel: [168882.078608] neutron-openvsw[4546]: unhandled signal 11 at 8822cbf03d000008 nip 00000000100aa764 lr 00000000100e6900 code 30001
| Oct 14 22:21:37 floette kernel: [168893.597834] init: neutron-plugin-openvswitch-agent main process (4546) killed by SEGV signal
| Oct 14 22:21:39 floette kernel: [168894.949777] nova-rootwrap[153708]: unhandled signal 11 at 88d495c93c0000a8 nip 00000000100a57d4 lr 00000000100ab42c code 30001
| Oct 14 22:21:43 floette kernel: [168898.973700] neutron-rootwra[153847]: unhandled signal 11 at 08c90df318000020 nip 00000000101ac260 lr 00000000101ad92c code 30001
| Oct 14 22:21:44 floette kernel: [168900.785421] neutron-rootwra[153850]: unhandled signal 11 at 88d87b783f0000a8 nip 00000000101abf40 lr 00000000100d9cac code 30001
| Oct 14 22:21:46 floette kernel: [168902.724121] neutron-openvsw[153852]: unhandled signal 11 at 882b78783f0000a8 nip 00000000100b5c8c lr 000000001014c734 code 30001
| hloeung@patrat:~$ zgrep -E '(SEGV)|(unhandled signal 11)' /var/log/syslog.7.gz
| Oct 15 00:48:13 patrat kernel: [553143.677075] rsync[89656]: unhandled signal 11 at 00003fffe6a50000 nip 00003fff77e0d054 lr 00003fff77e06560 code 30001
| Oct 16 02:42:03 wailmer kernel: [862104.157449] nova-compute[11431]: unhandled signal 11 at 081169bc370000a8 nip 00000000100ac164 lr 00000000100b7d6c code 30001
| Oct 16 02:42:03 wailmer kernel: [862104.264242] init: nova-compute main process (11431) killed by SEGV signal
| Oct 16 06:38:22 wailmer kernel: [876282.603855] qemu-img[78662]: unhandled signal 11 at 11b625104e000000 nip 00003fffb6224bb4 lr 00003fffb620c42c code 30001
| Oct 16 06:38:23 wailmer kernel: [876283.336045] qemu-system-ppc[78609]: unhandled signal 11 at ffffffc10000009a nip 00003fffae1a7124 lr 0000000010314874 code 30001
| Oct 16 06:39:40 wailmer kernel: [876360.399550] neutron-rootwra[79380]: unhandled signal 11 at 0800c20428000000 nip 00000000100a6c14 lr 00000000100a6d4c code 30001
| Oct 16 06:39:47 wailmer kernel: [876367.577184] neutron-rootwra[79676]: unhandled signal 11 at 0878a100000040a8 nip 00000000100aa768 lr 000000001004ed6c code 30001
| Oct 16 06:39:49 wailmer kernel: [876369.478066] neutron-openvsw[12655]: unhandled signal 11 at 088e47f11f000008 nip 00000000100db46c lr 00000000100db424 code 30001
| Oct 16 06:39:58 wailmer kernel: [876378.286827] init: neutron-plugin-openvswitch-agent main process (12655) killed by SEGV signal
| Oct 16 06:39:59 wailmer kernel: [876379.211801] sudo[79703]: unhandled signal 11 at 886baddd38005000 nip 886baddd38005000 lr 00003fff7da870a8 code 30001
| Oct 16 06:40:00 wailmer kernel: [876380.344562] libvirtd[109725]: unhandled signal 11 at 88806be02f000000 nip 00003fff78a70684 lr 00003fff78ab7a5c code 30001
| Oct 16 06:40:06 wailmer kernel: [876386.781123] init: libvirt-bin main process (109725) killed by SEGV signal
| Oct 16 06:40:06 wailmer kernel: [876386.818672] sudo[79919]: unhandled signal 11 at 11bda1eb70000000 nip 00003fff82094ac4 lr 00003fff8207c42c code 30001
| Oct 16 06:40:06 wailmer kernel: [876386.921414] neutron-openvsw[79689]: unhandled signal 11 at 88f8010000005000 nip 00000000100ba0d8 lr 00000000100c97c8 code 30001
| Oct 16 06:40:06 wailmer kernel: [876387.024431] init: neutron-plugin-openvswitch-agent main process (79689) killed by SEGV signal
These servers are all running Trusty with the hwe-v kernel (3.19.0-31-generic #36~14.04.1-Ubuntu).
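To compare hosts and processes, the "unhandled signal 11" lines above can be tallied with a short script. This is only a sketch: the regular expression is written against the line format shown in these excerpts, and the names (FAULT_RE, parse_faults, count_by) are made up for illustration.

```python
import re
import sys
from collections import Counter

# Matches kernel lines of the form shown in the excerpts above, e.g.
#   Oct 16 23:31:38 gligar kernel: [88351.465559] neutron-openvsw[29733]:
#   unhandled signal 11 at 88f9010000000000 nip ... lr ... code 30001
# A leading "| " quote marker is tolerated.
FAULT_RE = re.compile(
    r"^(?:\|\s*)?(?P<month>\w{3})\s+(?P<day>\d+)\s+(?P<time>[\d:]+)\s+"
    r"(?P<host>\S+)\s+kernel:\s+\[\s*(?P<uptime>[\d.]+)\]\s+"
    r"(?P<comm>[^\[\]]+)\[(?P<pid>\d+)\]:\s+unhandled signal 11 "
    r"at (?P<addr>[0-9a-f]+) nip (?P<nip>[0-9a-f]+) "
    r"lr (?P<lr>[0-9a-f]+) code (?P<code>[0-9a-f]+)"
)


def parse_faults(lines):
    """Yield one dict per 'unhandled signal 11' line; skip everything else."""
    for line in lines:
        match = FAULT_RE.search(line.strip())
        if match:
            yield match.groupdict()


def count_by(faults, key):
    """Count parsed faults by one field, such as 'host' or 'comm'."""
    return Counter(fault[key] for fault in faults)


if __name__ == "__main__":
    # Reads log lines from standard input.
    faults = list(parse_faults(sys.stdin))
    for key in ("host", "comm"):
        print(key)
        for name, count in count_by(faults, key).most_common():
            print("  %5d  %s" % (count, name))
```

It can be fed from the same command used above, for example by piping the output of zgrep -h 'unhandled signal 11' /var/log/syslog* into the script. The "init: ... killed by SEGV signal" lines are deliberately skipped, since they repeat a fault already reported by the kernel for the same PID.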
ProblemType: Crash
DistroRelease: Ubuntu 14.04
Package: nova-compute 1:2015.1.1-0ubuntu1~cloud2 [origin: Canonical]
ProcVersionSignature: Ubuntu 3.19.0-30.34~14.04.1-generic 3.19.8-ckt6
Uname: Linux 3.19.0-30-generic ppc64le
ApportVersion: 2.14.1-0ubuntu3.16
Architecture: ppc64el
CrashDB:
{
"impl": "launchpad",
"project": "cloud-archive",
"bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml",
}
Date: Fri Oct 16 23:30:00 2015
ExecutablePath: /usr/bin/nova-compute
InterpreterPath: /usr/bin/python2.7
PackageArchitecture: all
ProcCmdline: /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
ProcEnviron:
TERM=linux
PATH=(custom, no user)
ProcLoadAvg: 1.98 1.32 1.28 3/1516 7754
ProcSwaps:
Filename Type Size Used Priority
/swap.img file 8388544 0 -1
ProcVersion: Linux version 3.19.0-30-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015
Signal: 6
SourcePackage: nova
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: libvirtd
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_smt: SMT is off
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Oct 22 03:34 seq
crw-rw---- 1 root audio 116, 33 Oct 22 03:34 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.18
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Package: linux-meta-lts-vivid
PciMultimedia:
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_GB
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: root=UUID=fcd256a9-8aa6-4805-95ae-f8c635967753 ro console=ttyS1
ProcLoadAvg: 3.77 2.83 2.55 3/1574 89091
ProcSwaps:
Filename Type Size Used Priority
/swap.img file 8388544 0 -1
ProcVersion: Linux version 3.19.0-31-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #36~14.04.1-Ubuntu SMP Thu Oct 8 10:25:49 UTC 2015
ProcVersionSignature: Ubuntu 3.19.0-31.36~14.04.1-generic 3.19.8-ckt7
RelatedPackageVersions:
linux-restricted-modules-3.19.0-31-generic N/A
linux-backports-modules-3.19.0-31-generic N/A
linux-firmware 1.127.16
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.19.0-31-generic ppc64le
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm
_MarkForUpload: True
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_dscr: DSCR is 0
cpu_freq:
min: 2.016 GHz (cpu 80)
max: 3.691 GHz (cpu 32)
avg: 3.527 GHz
cpu_runmode:
Could not retrieve current diagnostics mode,
No firmware implementation of function
cpu_smt: SMT is off
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1508767/+subscriptions