
kernel-packages team mailing list archive

[Bug 1450584] Re: mono occasionally crashes since kernel 3.13.0-48 on multi-cpu vm

 

This bug was fixed in the package linux - 3.13.0-54.91

---------------
linux (3.13.0-54.91) trusty; urgency=medium

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1458618

  [ Upstream Kernel Changes ]

  * [3.13-stable only] Revert "gianfar: Carefully free skbs in functions
    called by netpoll."
    - LP: #1454746

linux (3.13.0-54.90) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1458618

  [ Andy Whitcroft ]

  * [Config] push off linux-lts-{utopic, vivid}-tools-common
    - LP: #1405807

  [ Brad Figg ]

  * hyper-v -- add hid and fb drivers to linux-virtual
    - LP: #1444179

  [ Upstream Kernel Changes ]

  * DT doc: net: cpsw mac-address is optional
    - LP: #1452628
  * net: cpsw: Add missing return value
    - LP: #1452628
  * net: cpsw: header, Add missing include
    - LP: #1452628
  * net: cpsw: Add am33xx MACID readout
    - LP: #1452628
  * am33xx: define syscon control module device node
    - LP: #1452628
  * arm: dts: am33xx, Add syscon phandle to cpsw node
    - LP: #1452628
  * net: cpsw: do not register cpts twice
    - LP: #1452620
  * x86: kvm: Revert "remove sched notifier for cross-cpu migrations"
    - LP: #1450584
  * x86: vdso: fix pvclock races with task migration
    - LP: #1450584
  * n_tty: Fix read buffer overwrite when no newline
    - LP: #1381005, #1454746
  * KVM: x86: Fix lost interrupt on irr_pending race
    - LP: #1454746
  * writeback: add missing INITIAL_JIFFIES init in
    global_update_bandwidth()
    - LP: #1454746
  * nbd: fix possible memory leak
    - LP: #1454746
  * mfd: kempld-core: Fix callback return value check
    - LP: #1454746
  * KVM: nVMX: mask unrestricted_guest if disabled on L0
    - LP: #1454746
  * spi: trigger trace event for message-done before mesg->complete
    - LP: #1454746
  * powerpc/pseries: Little endian fixes for post mobility device tree
    update
    - LP: #1454746
  * net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5}
    - LP: #1454746
  * perf: Fix irq_work 'tail' recursion
    - LP: #1454746
  * arm64: Use the reserved TTBR0 if context switching to the init_mm
    - LP: #1454746
  * selinux: fix sel_write_enforce broken return value
    - LP: #1454746
  * mm: fix anon_vma->degree underflow in anon_vma endless growing
    prevention
    - LP: #1454746
  * mm/memory hotplug: postpone the reset of obsolete pgdat
    - LP: #1454746
  * hfsplus: fix B-tree corruption after insertion at position 0
    - LP: #1454746
  * ARC: SA_SIGINFO ucontext regs off-by-one
    - LP: #1454746
  * writeback: fix possible underflow in write bandwidth calculation
    - LP: #1454746
  * iio: fix drivers that check buffer->scan_mask
    - LP: #1454746
  * iio: inv_mpu6050: Clear timestamps fifo while resetting hardware fifo
    - LP: #1454746
  * iio: core: Fix double free.
    - LP: #1454746
  * USB: ftdi_sio: Added custom PID for Synapse Wireless product
    - LP: #1454746
  * iwlwifi: dvm: run INIT firmware again upon .start()
    - LP: #1454746
  * USB: keyspan_pda: add new device id
    - LP: #1454746
  * cifs: smb2_clone_range() - exit on unhandled error
    - LP: #1454746
  * cifs: fix use-after-free bug in find_writable_file
    - LP: #1454746
  * can: flexcan: Deferred on Regulator return EPROBE_DEFER
    - LP: #1454746
  * usb: xhci: handle Config Error Change (CEC) in xhci driver
    - LP: #1454746
  * usb: xhci: apply XHCI_AVOID_BEI quirk to all Intel xHCI controllers
    - LP: #1454746
  * USB: ftdi_sio: Use jtag quirk for SNAP Connect E10
    - LP: #1454746
  * tty: serial: fsl_lpuart: clear receive flag on FIFO flush
    - LP: #1454746
  * radeon: Do not directly dereference pointers to BIOS area.
    - LP: #1454746
  * iio: imu: Use iio_trigger_get for indio_dev->trig assignment
    - LP: #1454746
  * dmaengine: edma: fix memory leak when terminating running transfers
    - LP: #1454746
  * dmaengine: omap-dma: Fix memory leak when terminating running transfer
    - LP: #1454746
  * x86/reboot: Add ASRock Q1900DC-ITX mainboard reboot quirk
    - LP: #1454746
  * mac80211: fix RX A-MPDU session reorder timer deletion
    - LP: #1454746
  * tcp: prevent fetching dst twice in early demux code
    - LP: #1454746
  * net: use for_each_netdev_safe() in rtnl_group_changelink()
    - LP: #1454746
  * xen-netfront: transmit fully GSO-sized packets
    - LP: #1454746
  * tcp: fix FRTO undo on cumulative ACK of SACKed range
    - LP: #1454746
  * PCI: cpcihp: Add missing curly braces in cpci_configure_slot()
    - LP: #1454746
  * sh_veu: v4l2_dev wasn't set
    - LP: #1454746
  * media: s5p-mfc: fix mmap support for 64bit arch
    - LP: #1454746
  * cpuidle: ACPI: do not overwrite name and description of C0
    - LP: #1454746
  * ioctx_alloc(): fix vma (and file) leak on failure
    - LP: #1454746
  * ALSA: hda/realtek - Make more stable to get pin sense for ALC283
    - LP: #1454746
  * be2iscsi: Fix kernel panic when device initialization fails
    - LP: #1454746
  * Defer processing of REQ_PREEMPT requests for blocked devices
    - LP: #1454746
  * ALSA: hda - Fix headphone pin config for Lifebook T731
    - LP: #1454746
  * ocfs2: _really_ sync the right range
    - LP: #1454746
  * ALSA: usb - Creative USB X-Fi Pro SB1095 volume knob support
    - LP: #1454746
  * iscsi target: fix oops when adding reject pdu
    - LP: #1454746
  * net/mlx4_en: Call register_netdevice in the proper location
    - LP: #1454746
  * ipv6: protect skb->sk accesses from recursive dereference inside the
    stack
    - LP: #1454746
  * tcp: tcp_make_synack() should clear skb->tstamp
    - LP: #1454746
  * 8139cp: Call dev_kfree_skby_any instead of kfree_skb.
    - LP: #1454746
  * 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
    - LP: #1454746
  * r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
    - LP: #1454746
  * bonding: Call dev_kfree_skby_any instead of kfree_skb.
    - LP: #1454746
  * bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
    - LP: #1454746
  * tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
    - LP: #1454746
  * ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
    - LP: #1454746
  * benet: Call dev_kfree_skby_any instead of kfree_skb.
    - LP: #1454746
  * gianfar: Carefully free skbs in functions called by netpoll.
    - LP: #1454746
  * ip_forward: Drop frames with attached skb->sk
    - LP: #1454746
  * tcp: fix possible deadlock in tcp_send_fin()
    - LP: #1454746
  * tcp: avoid looping in tcp_send_fin()
    - LP: #1454746
  * net: do not deplete pfmemalloc reserve
    - LP: #1454746
  * net: fix crash in build_skb()
    - LP: #1454746
  * ipv4: Missing sk_nulls_node_init() in ping_unhash().
    - LP: #1454746
  * Linux 3.13.11-ckt20
    - LP: #1454746
  * of: Add support for ePAPR "stdout-path" property
    - LP: #1438585
  * lib: add glibc style strchrnul() variant
    - LP: #1438585
  * of: Create unlocked version of for_each_child_of_node()
    - LP: #1438585
  * of: Make of_find_node_by_path() handle /aliases
    - LP: #1438585
  * of: Create of_console_check() for selecting a console specified in
    /chosen
    - LP: #1438585
  * of: Enable console on serial ports specified by /chosen/stdout-path
    - LP: #1438585
  * of: correct of_console_check()'s return value
    - LP: #1438585
  * of: Add bindings for chosen node, stdout-path
    - LP: #1438585
  * of: add optional options parameter to of_find_node_by_path()
    - LP: #1438585
  * of: support passing console options with stdout-path
    - LP: #1438585
  * (upstream) net/mlx4_core: Adjust command timeouts to conform to the
    firmware spec
    - LP: #1455121
  * arm64: kernel: add MPIDR_EL1 accessors macros
    - LP: #1455372
  * of: reimplement the matching method for __of_match_node()
    - LP: #1455372
  * arm64: remove redundant "psci:" prefixes
    - LP: #1455372
  * arm64: remove return value form psci_init()
    - LP: #1455372
  * arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive
    - LP: #1455372
  * KVM: Add capability to advertise PSCI v0.2 support
    - LP: #1455372
  * ARM/ARM64: KVM: Add common header for PSCI related defines
    - LP: #1455372
  * ARM/ARM64: KVM: Add base for PSCI v0.2 emulation
    - LP: #1455372
  * KVM: Documentation: Add info regarding KVM_ARM_VCPU_PSCI_0_2 feature
    - LP: #1455372
  * ARM/ARM64: KVM: Make kvm_psci_call() return convention more flexible
    - LP: #1455372
  * KVM: Add KVM_EXIT_SYSTEM_EVENT to user space API header
    - LP: #1455372
  * ARM/ARM64: KVM: Emulate PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET
    - LP: #1455372
  * ARM/ARM64: KVM: Emulate PSCI v0.2 AFFINITY_INFO
    - LP: #1455372
  * ARM/ARM64: KVM: Emulate PSCI v0.2 MIGRATE_INFO_TYPE and related
    functions
    - LP: #1455372
  * ARM/ARM64: KVM: Fix CPU_ON emulation for PSCI v0.2
    - LP: #1455372
  * ARM/ARM64: KVM: Emulate PSCI v0.2 CPU_SUSPEND
    - LP: #1455372
  * ARM/ARM64: KVM: Advertise KVM_CAP_ARM_PSCI_0_2 to user space
    - LP: #1455372
  * PSCI: Add initial support for PSCIv0.2 functions
    - LP: #1455372
  * Documentation: devicetree: Add new binding for PSCIv0.2
    - LP: #1455372
  * ARM: Check if a CPU has gone offline
    - LP: #1455372
  * arm64: KVM: Enable minimalistic support for Cortex-A53
    - LP: #1455372
  * HID: multitouch: add support of clickpads
    - LP: #1456881
  * vhost/scsi: potential memory corruption
    - LP: #1457807
    - CVE-2015-4036

 -- Luis Henriques <luis.henriques@xxxxxxxxxxxxx>  Tue, 26 May 2015 17:19:30 +0100

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1450584

Title:
  mono occasionally crashes since kernel 3.13.0-48 on multi-cpu vm

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux source package in Utopic:
  Fix Released
Status in linux source package in Vivid:
  Fix Released

Bug description:
  [Impact]
  The addition of this commit:
  http://kernel.ubuntu.com/git/ubuntu/ubuntu-trusty.git/commit/?id=11f4e0339c8dc8d760483258efd9f15b4c6dcda2

  causes SIGSEGVs when running certain workloads on multi-CPU VMs.

  [Test Case]

  A Mono test case that triggers the SIGSEGV is available here:
  https://bugzilla.xamarin.com/show_bug.cgi?id=29212

  [Fix]

  These two commits are required for fixing this issue:
  https://github.com/torvalds/linux/commit/80f7fdb1c7f0f9266421f823964fd1962681f6ce
  https://github.com/torvalds/linux/commit/0a4e6be9ca17c54817cf814b4b5aa60478c6df27

  --

  Since late March, more and more users have complained about frequent
  SIGSEGV crashes in our .NET/Mono application. In early April I started
  investigating the problem actively.

  After ruling out the native libraries we use and testing various Mono
  versions, I discovered that the crashes occur more frequently on a
  VirtualBox VM with multiple CPUs configured. I also found that the Mono
  bug-18026.cs test case crashes fairly consistently. At that point I
  reported the problem to the Mono bug tracker.

  I finally got a break when we found a correlation with the kernel version: 3.13.0-46 didn't crash, while 3.13.0-48 and -49 did.
  As more and more users upgrade to these newer kernel versions, they start running into the issue, which explains the gradual increase in reports.
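
  The version correlation above can be turned into a quick check of a
  kernel release string. This is only a sketch under stated assumptions:
  `classify_trusty_kernel` is a hypothetical helper, it covers the Trusty
  3.13.0 series only, and it assumes every build between the first
  reported-bad one (-48) and the fixed one (-54) carries the regression.
  Only -48, -49 and -51 are actually named as crashing in this report, and
  -47 was not tested.

```python
import re


def classify_trusty_kernel(release: str) -> str:
    """Classify a 3.13.0-N release string against the versions in this report."""
    match = re.match(r"^3\.13\.0-(\d+)", release)
    if match is None:
        # Other series (e.g. 3.16.0-36) are outside the scope of this sketch.
        return "not a 3.13.0 kernel"
    abi = int(match.group(1))
    if abi <= 46:
        return "before regression"  # 3.13.0-46 was the known-good bisect start
    if abi == 47:
        return "unknown"            # not tested in this report
    if abi <= 53:
        return "affected"           # assumption: regression present until the fix
    return "fixed"                  # fix released in 3.13.0-54.91
```

  On a running system the release string could be taken from
  `platform.release()`, which returns the same value as `uname -r`.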

  Early this week I performed a full git bisect on the kernel between 3.13.0-46 and -48 and isolated the commit that seems to trigger the crashes:
  http://kernel.ubuntu.com/git/ubuntu/ubuntu-trusty.git/commit/?id=11f4e0339c8dc8d760483258efd9f15b4c6dcda2

  At this point I don't know whether the commit broke something or whether mono simply handles it incorrectly. However, two commits from linux 4.x seem to fix it:
  https://github.com/torvalds/linux/commit/80f7fdb1c7f0f9266421f823964fd1962681f6ce
  https://github.com/torvalds/linux/commit/0a4e6be9ca17c54817cf814b4b5aa60478c6df27
  I applied these commits myself on top of commit 11f4e033, compiled, and ran the test case: it didn't crash in any of the 200 test runs I did.
  I don't know whether those two patches have side effects, though.
  I'm not an expert on the kernel, not even remotely, but I thought it would be nice to be able to point at a possible solution.
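
  Repeated test runs like the ones described above can be automated. The
  following is a minimal sketch, not the script used for this report:
  `count_fatal_signals` is a hypothetical helper, and the command in the
  usage comment is an assumption about how the compiled test case would be
  launched. It counts any run that ends on a fatal signal rather than only
  SIGSEGV, because a runtime may catch the segfault and then abort with a
  different signal.

```python
import subprocess


def count_fatal_signals(command, runs):
    """Run `command` `runs` times and count the runs killed by a signal."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code -N means the child was
        # terminated by signal N (for example -11 for SIGSEGV).
        if result.returncode < 0:
            crashes += 1
    return crashes


# Hypothetical usage; the executable name is an assumption:
#   crashes = count_fatal_signals(["mono", "bug-18026.exe"], 200)
#   print(f"{crashes} of 200 runs ended on a fatal signal")
```

  Because the crash is intermittent, a run count of zero only makes the
  fix more plausible; it does not prove the race is gone.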

  My current test VM is a 64-bit VirtualBox VM installed from the 14.04.2 server ISO, running on an older quad-core i7 host with 64-bit Windows 7.
  In the VM I've tested numerous mono and kernel combinations. The last tests were with kernels 3.16.0-36 and 3.13.0-51 and mono 4.0.1, and the problem still occurs with both.

  By now I've debugged the app using gdb several dozen times on various
  user setups, compiled mono half a dozen times, and then did the kernel
  bisect with its 8x3h of compiling :) Talk about going down the rabbit
  hole...

  So I'm pretty desperate for some expert to help me out here. :D

  Reference to mono bug report:
  https://bugzilla.xamarin.com/show_bug.cgi?id=29212

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-3.13.0-51-generic 3.13.0-51.84
  ProcVersionSignature: Ubuntu 3.13.0-51.84-generic 3.13.11-ckt18
  Uname: Linux 3.13.0-51-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Apr 30 18:53 seq
   crw-rw---- 1 root audio 116, 33 Apr 30 18:53 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.14.1-0ubuntu3.10
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory: 'iw'
  CurrentDmesg: [    9.379188] init: plymouth-upstart-bridge main process ended, respawning
  Date: Thu Apr 30 19:45:43 2015
  HibernationDevice: RESUME=UUID=b35ef328-166d-4476-a418-e7e80d22cb30
  InstallationDate: Installed on 2015-04-22 (7 days ago)
  InstallationMedia: Ubuntu-Server 14.04.2 LTS "Trusty Tahr" - Release amd64 (20150218.1)
  IwConfig:
   eth0      no wireless extensions.

   lo        no wireless extensions.
  Lsusb:
   Bus 001 Device 002: ID 80ee:0021 VirtualBox USB Tablet
   Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: innotek GmbH VirtualBox
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-51-generic root=UUID=68da7e09-1a91-4107-859d-bf452f9ed992 ro
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-51-generic N/A
   linux-backports-modules-3.13.0-51-generic  N/A
   linux-firmware                             1.127.11
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 12/01/2006
  dmi.bios.vendor: innotek GmbH
  dmi.bios.version: VirtualBox
  dmi.board.name: VirtualBox
  dmi.board.vendor: Oracle Corporation
  dmi.board.version: 1.2
  dmi.chassis.type: 1
  dmi.chassis.vendor: Oracle Corporation
  dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:rvnOracleCorporation:rnVirtualBox:rvr1.2:cvnOracleCorporation:ct1:cvr:
  dmi.product.name: VirtualBox
  dmi.product.version: 1.2
  dmi.sys.vendor: innotek GmbH

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1450584/+subscriptions

