kernel-packages team mailing list archive
Message #119483
[Bug 1450584] Re: mono occassionally crashes since kernel 3.13.0-48 on multi-cpu vm
On my website www.medinet.fr, I am using Ubuntu Server and Mono. I
downgraded the server to one CPU to avoid crashes while waiting for
the right fix. The supposedly fixed kernel does not work at all: the
server crashes every 10 minutes at most. Regards. M. Badr CHOUFFAI,
.Net Software Architect.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1450584
Title:
mono occassionally crashes since kernel 3.13.0-48 on multi-cpu vm
Status in linux package in Ubuntu:
In Progress
Status in linux source package in Trusty:
Fix Committed
Status in linux source package in Utopic:
Fix Committed
Status in linux source package in Vivid:
Fix Committed
Bug description:
[Impact]
The addition of the commit:
http://kernel.ubuntu.com/git/ubuntu/ubuntu-trusty.git/commit/?id=11f4e0339c8dc8d760483258efd9f15b4c6dcda2
causes SIGSEGVs when running certain workloads on multi-cpu VMs.
[Test Case]
The Mono test case that triggers the SIGSEGV is here:
https://bugzilla.xamarin.com/show_bug.cgi?id=29212
[Fix]
These two commits are required for fixing this issue:
https://github.com/torvalds/linux/commit/80f7fdb1c7f0f9266421f823964fd1962681f6ce
https://github.com/torvalds/linux/commit/0a4e6be9ca17c54817cf814b4b5aa60478c6df27
--
Since late March, more and more users have complained about frequent
SIGSEGV crashes in our .net/mono application. In early April I started
to investigate actively.
After ruling out the native libraries and testing various mono
versions, I discovered that the crashes occur more frequently on a
vbox vm with multiple cpus configured, and that the mono
bug-18026.cs testcase crashes fairly consistently. At that point
it was reported to the mono bug tracker.
I finally got a break when we found a correlation with the kernel version. 3.13.0-46 didn't crash while 3.13.0-48,49 did.
More and more users upgrade to these newer kernel versions and start running into issues, which explains the gradual increase in reports.
Early this week I performed a full git bisect on the kernel between 3.13.0-46 and -48 and isolated the commit that seems to trigger the crashes.
Namely http://kernel.ubuntu.com/git/ubuntu/ubuntu-trusty.git/commit/?id=11f4e0339c8dc8d760483258efd9f15b4c6dcda2
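The bisect described above is a binary search over the commits between a known-good and a known-bad kernel. As a rough illustration only (this is not how git implements it, and the names `bisect_first_bad` and `is_bad` are hypothetical), the idea can be sketched in Python. Each call to `is_bad` stands for one build-and-test cycle:

```python
from typing import Callable, Sequence, Tuple


def bisect_first_bad(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return (first bad commit, number of tests performed).

    Assumes commits are ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and every commit after the first
    bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-test cycle per probe
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests
```

Because the search range halves with every probe, eight build-and-test cycles are enough to cover up to 256 candidate commits.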
At this point I don't know whether the commit broke something or mono simply handles it incorrectly. However, a few commits for linux 4.x seem to fix it:
https://github.com/torvalds/linux/commit/80f7fdb1c7f0f9266421f823964fd1962681f6ce
https://github.com/torvalds/linux/commit/0a4e6be9ca17c54817cf814b4b5aa60478c6df27
I applied these commits myself on top of commit 11f4e033, compiled, and ran the testcase: it didn't crash in the 200 test runs I did.
However, I don't know whether those two patches have side effects.
I'm not an expert on the kernel, not even remotely. But I thought it would be nice to be able to point at a possible solution.
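For anyone who wants to repeat the 200-run check, a minimal Python sketch of such a loop follows. It is an illustration, not the script actually used: the function name `count_signal_deaths` is hypothetical, and it assumes that a crash shows up as the process being killed by a signal (SIGSEGV directly, or possibly another signal if the runtime aborts after reporting the fault).

```python
import subprocess
from typing import Sequence


def count_signal_deaths(cmd: Sequence[str], runs: int = 200) -> int:
    """Run cmd repeatedly and count the runs that were killed by a signal.

    On POSIX systems, subprocess reports death by signal N as a
    return code of -N (for example -11 for SIGSEGV).
    """
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if proc.returncode < 0:  # assumption: any signal death is a crash
            crashes += 1
    return crashes
```

It would be called with whatever command runs the compiled testcase, for example `count_signal_deaths(["mono", "bug-18026.exe"])`, where the executable name is an assumption.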
My current test vm is a 64-bit virtualbox vm installed from the 14.04.2 server iso, running on an older i7 quad core host with 64-bit Windows 7.
In the vm I've tested numerous mono and kernel combinations. The last tests were with kernels 3.16.0-36 and 3.13.0-51 and mono 4.0.1; the problem still occurs.
By now I've debugged the app using gdb several dozen times on various
user setups, compiled mono half a dozen times, and then did the kernel
bisect, 8 compiles of 3 hours each :) Talk about going down the rabbit hole...
So I'm pretty desperate for some expert to help me out here. :D
Reference to mono bug report:
https://bugzilla.xamarin.com/show_bug.cgi?id=29212
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-51-generic 3.13.0-51.84
ProcVersionSignature: Ubuntu 3.13.0-51.84-generic 3.13.11-ckt18
Uname: Linux 3.13.0-51-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Apr 30 18:53 seq
crw-rw---- 1 root audio 116, 33 Apr 30 18:53 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.10
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
CurrentDmesg: [ 9.379188] init: plymouth-upstart-bridge main process ended, respawning
Date: Thu Apr 30 19:45:43 2015
HibernationDevice: RESUME=UUID=b35ef328-166d-4476-a418-e7e80d22cb30
InstallationDate: Installed on 2015-04-22 (7 days ago)
InstallationMedia: Ubuntu-Server 14.04.2 LTS "Trusty Tahr" - Release amd64 (20150218.1)
IwConfig:
eth0 no wireless extensions.
lo no wireless extensions.
Lsusb:
Bus 001 Device 002: ID 80ee:0021 VirtualBox USB Tablet
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: innotek GmbH VirtualBox
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-51-generic root=UUID=68da7e09-1a91-4107-859d-bf452f9ed992 ro
RelatedPackageVersions:
linux-restricted-modules-3.13.0-51-generic N/A
linux-backports-modules-3.13.0-51-generic N/A
linux-firmware 1.127.11
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.board.name: VirtualBox
dmi.board.vendor: Oracle Corporation
dmi.board.version: 1.2
dmi.chassis.type: 1
dmi.chassis.vendor: Oracle Corporation
dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:rvnOracleCorporation:rnVirtualBox:rvr1.2:cvnOracleCorporation:ct1:cvr:
dmi.product.name: VirtualBox
dmi.product.version: 1.2
dmi.sys.vendor: innotek GmbH
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1450584/+subscriptions