kernel-packages team mailing list archive
Message #139862
Re: [Bug 1499203] Re: memory leak in hv_storvsc (3.13.0-63-generic)
On Friday, October 09, 2015 at 06:59, Oskar Liljeblad wrote:
> > > To see if it is the cause of this issue, I built a test kernel with a
> > > revert of commit 97b2591. The test kernel can be downloaded from:
> > >
> > > http://kernel.ubuntu.com/~jsalisbury/lp1499203/
[..]
> The 3.13.0-66.107~lp1445195Commit97b2591Reverted kernel seems to work just
> fine. No memory leaks as far as I can see.
By the way, I had to downgrade the kernel above to 3.13.0-65.106 on one
server because of some strange IO lockup issues. I'm afraid this won't be
of much help, but I'm writing it anyway.
It started 1 minute after boot with the new kernel:
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106544] BUG: unable to handle kernel NULL pointer dereference at (null)
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106592] IP: [<ffffffff81206c5b>] eventpoll_release_file+0x2b/0xa0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106624] PGD 1f72db067 PUD 1fa753067 PMD 0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106659] Oops: 0000 [#1] SMP
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106684] Modules linked in: joydev hid_generic mac_hid serio_raw crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd nls_iso8859_1 hid_hyperv hyperv_fb hid hyperv_keyboard lp parport hv_netvsc hv_utils hv_storvsc hv_vmbus
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106848] CPU: 1 PID: 1286 Comm: mongod Not tainted 3.13.0-66-generic #107~lp1445195Commit97b2591Reverted
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106884] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106923] task: ffff8801f722c800 ti: ffff8801f72ce000 task.ti: ffff8801f72ce000
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106950] RIP: 0010:[<ffffffff81206c5b>] [<ffffffff81206c5b>] eventpoll_release_file+0x2b/0xa0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.106986] RSP: 0018:ffff8801f72cfe78 EFLAGS: 00010246
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107006] RAX: 0000000000000000 RBX: ffff8801f775e300 RCX: 0000000040000010
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107032] RDX: 0000000001000000 RSI: 0000000000000000 RDI: ffffffff81c72e80
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107058] RBP: ffff8801f72cfea0 R08: 0000000000000000 R09: 0000000000000001
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107084] R10: ffff8801f775ece1 R11: 0000000000000293 R12: 0000000000000010
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107110] R13: ffff8801f775ece1 R14: ffff8801f775ee40 R15: ffff8801f775e3b0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107137] FS: 00007f23b299f700(0000) GS:ffff8801fee20000(0000) knlGS:0000000000000000
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107166] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107190] CR2: 0000000000000000 CR3: 00000001f7a94000 CR4: 00000000001406e0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107224] Stack:
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107235] ffff8801f775e300 0000000000000010 ffff8801f775ece1 ffff8801f775ee40
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107270] ffff880036927a40 ffff8801f72cfee8 ffffffff811bfb7a ffffffff8133ed81
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107302] ffff8801fa8bbe30 0000000000000000 ffffffff81ebb680 ffff8801f722ce20
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107336] Call Trace:
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107353] [<ffffffff811bfb7a>] __fput+0x24a/0x260
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107375] [<ffffffff8133ed81>] ? blkdev_issue_flush+0x71/0x90
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107400] [<ffffffff811bfbde>] ____fput+0xe/0x10
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107421] [<ffffffff81088377>] task_work_run+0xa7/0xe0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107444] [<ffffffff81013e57>] do_notify_resume+0x97/0xb0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107468] [<ffffffff8173431a>] int_signal+0x12/0x17
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107491] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 ff 48 c7 c7 80 2e c7 81 49 81 c7 b0 00 00 00 41 56 41 55 41 54 53 e8 b8 30 52 00 49 8b 07 <48> 8b 08 49 39 c7 4c 8d 60 a8 48 8d 59 a8 75 0b eb 3e 0f 1f 00
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107648] RIP [<ffffffff81206c5b>] eventpoll_release_file+0x2b/0xa0
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107675] RSP <ffff8801f72cfe78>
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107689] CR2: 0000000000000000
Oct 13 00:06:16 af-mdbdrs2 kernel: [ 66.107717] ---[ end trace 87deccc21e1958fa ]---
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210565] ------------[ cut here ]------------
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210612] kernel BUG at /home/jsalisbury/bugs/lp1499203/ubuntu-trusty/mm/rmap.c:1035!
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210642] invalid opcode: 0000 [#2] SMP
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210663] Modules linked in: joydev hid_generic mac_hid serio_raw crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd nls_iso8859_1 hid_hyperv hyperv_fb hid hyperv_keyboard lp parport hv_netvsc hv_utils hv_storvsc hv_vmbus
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210796] CPU: 1 PID: 1771 Comm: mongod Tainted: G D 3.13.0-66-generic #107~lp1445195Commit97b2591Reverted
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210834] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210873] task: ffff8801f7713000 ti: ffff8801fafa4000 task.ti: ffff8801fafa4000
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210900] RIP: 0010:[<ffffffff8171ee8a>] [<ffffffff8171ee8a>] __page_set_anon_rmap.part.22+0x9/0xb
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210939] RSP: 0018:ffff8801fafa59e8 EFLAGS: 00010246
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210960] RAX: 0000000000000000 RBX: ffffea00079a2340 RCX: ffffffffffffffe8
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.210986] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880207ff4f00
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.211021] RBP: ffff8801fafa59e8 R08: 00000000fffffff9 R09: 0000000000000000
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.212294] R10: 000000000000000c R11: 00000000003e9480 R12: 00007f084a5619e0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214126] R13: 0000000000000000 R14: ffff8801f775e300 R15: 0000000000000000
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] FS: 00007f084a561700(0000) GS:ffff8801fee20000(0000) knlGS:0000000000000000
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] CR2: 00007f084a5619e0 CR3: 00000001f7a94000 CR4: 00000000001406e0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] Stack:
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] ffff8801fafa5a18 ffffffff8118464a 00007f084a5619e0 ffff8800f78ea290
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] ffff8801f775e300 ffff8801fa652300 ffff8801fafa5ab0 ffffffff8117a708
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] ffff880035aab300 0000000035aab300 0000000000000000 0000000000001f4a
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] Call Trace:
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8118464a>] do_page_add_anon_rmap+0x10a/0x120
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8117a708>] handle_mm_fault+0xcf8/0xf00
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8172f624>] __do_page_fault+0x184/0x560
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff810a3281>] ? update_cfs_shares+0xb1/0x100
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8109ee48>] ? __enqueue_entity+0x78/0x80
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff810a51dd>] ? enqueue_entity+0x2ad/0xbb0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8101bb33>] ? native_sched_clock+0x13/0x80
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff810a5f02>] ? enqueue_task_fair+0x422/0x6d0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8172fa1a>] do_page_fault+0x1a/0x70
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8172bd68>] page_fault+0x28/0x30
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8137184f>] ? __get_user_8+0x1f/0x29
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff810db202>] ? exit_robust_list+0x32/0x130
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff81064a53>] mm_release+0x123/0x140
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff81069b43>] do_exit+0x153/0xa40
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8106a4af>] do_group_exit+0x3f/0xa0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8107a190>] get_signal_to_deliver+0x1d0/0x6d0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff810133f8>] do_signal+0x48/0xa10
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff81179e92>] ? handle_mm_fault+0x482/0xf00
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff81013e29>] do_notify_resume+0x69/0xb0
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] [<ffffffff8172bb62>] retint_signal+0x48/0x86
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] Code: c4 40 74 03 8b 4f 68 bf 00 10 00 00 48 d3 e7 e8 2d 58 a7 ff 5d c3 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 0f 1f 44 00 00 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 55 48 89 e5 0f 0b 55 89 f2 be 00 80 00 00
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] RIP [<ffffffff8171ee8a>] __page_set_anon_rmap.part.22+0x9/0xb
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.214539] RSP <ffff8801fafa59e8>
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.249727] ---[ end trace 87deccc21e1958fb ]---
Oct 13 00:08:51 af-mdbdrs2 kernel: [ 221.251013] Fixing recursive fault but reboot is needed!
After that, all IO on that device was stuck.
I rebooted the server and the issue occurred again, with basically the same messages logged.
Regards,
Oskar Liljeblad
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1499203
Title:
memory leak in hv_storvsc (3.13.0-63-generic)
Status in linux package in Ubuntu:
Confirmed
Status in linux source package in Trusty:
Confirmed
Bug description:
Slab and SUnreclaim values in /proc/meminfo keep increasing. On one
server it reached 85% of physical memory after 14 days - but on most
other servers it increases more slowly. I checked /proc/slabinfo and
almost all allocations were in kmalloc-512. So I enabled
"slub_debug=U,kmalloc-512" on one server, and after only 24h of uptime
11% of the memory was used by kmalloc-512 and unreclaimable. With
debugging enabled I could see the following in
/sys/kernel/slab/kmalloc-512/alloc_calls:
521294 storvsc_queuecommand+0x359/0x790 [hv_storvsc]
age=161922/955116/20882927 pid=1-41545
All other counters were below 2000. In
/sys/kernel/slab/kmalloc-512/free_calls I see the following:
516823 <not-available> age=4315783846 pid=0
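
The check described above can be sketched in Python. This is only an illustration, not a tool used in this report: the function names slab_usage and top_call_sites are hypothetical, and it assumes that each alloc_calls entry is a single line starting with a count followed by the call site (the entry quoted above is wrapped by the mail client).

```python
import re


def slab_usage(meminfo_text):
    """Return (slab_kb, sunreclaim_kb, memtotal_kb) from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like "Slab:   123456 kB"; names with parentheses
        # such as "Active(anon)" do not match and are simply skipped.
        match = re.match(r"(\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields["Slab"], fields["SUnreclaim"], fields["MemTotal"]


def top_call_sites(alloc_calls_text, limit=5):
    """Return the largest (count, call_site) pairs from alloc_calls text.

    Assumes one entry per line: "<count> <symbol+offset/size> ...".
    """
    entries = []
    for line in alloc_calls_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].isdigit():
            entries.append((int(parts[0]), parts[1]))
    entries.sort(reverse=True)
    return entries[:limit]
```

To use it on a live system, pass in the contents of /proc/meminfo and of /sys/kernel/slab/kmalloc-512/alloc_calls. The latter is only populated when the kernel was booted with slub_debug=U for that cache, and reading it typically requires root.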
The hv_storvsc module is for Hyper-V. We are (unfortunately) running
Hyper-V 6.3.9600.16384 with Microsoft System Center 2012 R2 Update
rollup 3 for all the servers with this issue.
Kernels are stock linux-image-3.13.0-63-generic, 3.13.0-63.103,
x86_64, from Ubuntu 14.04 LTS. /proc/version_signature contains:
Ubuntu 3.13.0-63.103-generic 3.13.11-ckt25
No output from lspci -vnvn. The problem described above happens on
both single and multicore virtual machines. The CPUs in the hypervisors
are E5-2630 v2 @ 2.60GHz. Let me know if you need more info or if I can
do more debugging.
Regards,
Oskar Liljeblad
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Sep 24 00:31 seq
crw-rw---- 1 root audio 116, 33 Sep 24 00:31 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.13
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg:
[59081.977909] systemd-udevd[26480]: starting version 204
[59124.051974] init: systemd-logind main process (756) killed by TERM signal
DistroRelease: Ubuntu 14.04
InstallationDate: Installed on 2014-09-09 (380 days ago)
InstallationMedia: Ubuntu-Server 14.04.1 LTS "Trusty Tahr" - Release amd64 (20140722.3)
IwConfig:
eth0 no wireless extensions.
eth1 no wireless extensions.
lo no wireless extensions.
Lspci:
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: Microsoft Corporation Virtual Machine
Package: linux (not installed)
PciMultimedia:
ProcFB: 0 hyperv_fb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-63-generic.efi.signed root=UUID=f4d228d6-2eee-40fc-bf3f-633e46fa8301 ro slub_debug=U,kmalloc-512
ProcVersionSignature: Ubuntu 3.13.0-63.103-generic 3.13.11-ckt25
RelatedPackageVersions:
linux-restricted-modules-3.13.0-63-generic N/A
linux-backports-modules-3.13.0-63-generic N/A
linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty
Uname: Linux 3.13.0-63-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
WifiSyslog:
Sep 24 02:06:19 adm-backup1 dhclient: message repeated 1447 times: [ DHCPREQUEST of 10.40.128.9 on eth0 to 192.0.2.253 port 67 (xid=0x429dad4)]
Sep 24 02:06:37 adm-backup1 dhclient: DHCPREQUEST of 10.40.128.9 on eth0 to 255.255.255.255 port 67 (xid=0x429dad4)
Sep 24 02:06:37 adm-backup1 dhclient: DHCPACK of 10.40.128.9 from 192.0.2.253
Sep 24 02:06:37 adm-backup1 dhclient: bound to 10.40.128.9 -- renewal in 44877 seconds.
_MarkForUpload: True
dmi.bios.date: 11/26/2012
dmi.bios.vendor: Microsoft Corporation
dmi.bios.version: Hyper-V UEFI Release v1.0
dmi.board.asset.tag: None
dmi.board.name: Virtual Machine
dmi.board.vendor: Microsoft Corporation
dmi.board.version: Hyper-V UEFI Release v1.0
dmi.chassis.asset.tag: 6126-4244-1659-0314-3158-3955-44
dmi.chassis.type: 3
dmi.chassis.vendor: Microsoft Corporation
dmi.chassis.version: Hyper-V UEFI Release v1.0
dmi.modalias: dmi:bvnMicrosoftCorporation:bvrHyper-VUEFIReleasev1.0:bd11/26/2012:svnMicrosoftCorporation:pnVirtualMachine:pvrHyper-VUEFIReleasev1.0:rvnMicrosoftCorporation:rnVirtualMachine:rvrHyper-VUEFIReleasev1.0:cvnMicrosoftCorporation:ct3:cvrHyper-VUEFIReleasev1.0:
dmi.product.name: Virtual Machine
dmi.product.version: Hyper-V UEFI Release v1.0
dmi.sys.vendor: Microsoft Corporation
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1499203/+subscriptions