group.of.nepali.translators team mailing list archive
Message #11220
[Bug 1641235] Re: Ubuntu 16.10: kdump over nfs did not generate complete vmcore
This bug was fixed in the package makedumpfile - 1:1.5.9-5ubuntu0.4
---------------
makedumpfile (1:1.5.9-5ubuntu0.4) xenial; urgency=medium
* d/p/0006-PATCH-Support-newer-kernels.patch :
Support kernel versions up to 4.8 (LP: #1557751)
* Turn hardcoded timeo and retrans NFS options into parameters that
can be modified in /etc/default/kdump-tools. Also use the NFS defaults
(timeo=600, retrans=3) for these parameters. Make those values visible
in the 'show' command if NFS is configured (LP: #1641235)
* Complete support for kernel versions 4.8 and later :
d/p/0007-PATCH-Looking-for-page.compound_order-compound_dtor-.patch,
d/p/0008-PATCH-Skip-examining-compound-tail-pages.patch (LP: #1655625)
-- Louis Bouchard <louis.bouchard@xxxxxxxxxx> Wed, 11 Jan 2017 11:33:42 +0100
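For illustration, the NFS settings described in the changelog entry above
might look like the fragment below in /etc/default/kdump-tools. The variable
names NFS_TIMEO and NFS_RETRANS are assumptions, since the changelog does not
name them; the values are the NFS defaults it cites.

```shell
# /etc/default/kdump-tools (fragment)
# NOTE: the variable names are assumed for illustration; the changelog
# only states that timeo and retrans became configurable parameters.

# Timeout, in tenths of a second, before the NFS client retries a request.
NFS_TIMEO="600"

# Number of retries before the NFS client attempts further recovery
# (on a soft mount, the request then fails).
NFS_RETRANS="3"
```

Per the changelog, the values in effect are reported by the 'show' command
when NFS is configured.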
** Changed in: makedumpfile (Ubuntu Xenial)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1641235
Title:
Ubuntu 16.10: kdump over nfs did not generate complete vmcore
Status in makedumpfile package in Ubuntu:
Fix Released
Status in makedumpfile source package in Trusty:
Confirmed
Status in makedumpfile source package in Xenial:
Fix Released
Status in makedumpfile source package in Yakkety:
Confirmed
Bug description:
== Comment: #0 - HARSHA THYAGARAJA - 2016-11-03 08:05:59 ==
---Problem Description---
kdump over nfs did not generate complete vmcore
---uname output---
Linux ltciofvtr-firestone1 4.8.0-26-generic #28-Ubuntu SMP Tue Oct 18 14:41:40 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = PowerNV (Baremetal) - Firestone
---Steps to Reproduce---
1. Set up NFS
2. Trigger crash: echo c > /proc/sysrq-trigger
== Comment: #6 - Kevin W. Rudd - 2016-11-04 16:30:49 ==
Hi Harsha.
It looks like the base kdump NFS functionality works just fine. The
known issue with makedumpfile is causing it to drop back to using "cp"
to transfer the entire, non-compressed /proc/vmcore image. That's a
rather large amount of data to send over to the remote server, and it
appears to be sending back an I/O error after the first 122G.
Further debug would need to be done to determine if this is a
client-side or server-side issue. I recommend first bringing your remote NFS
server up to the current release as it is currently a bit down-rev.
== Comment: #8 - HARSHA THYAGARAJA - 2016-11-10 02:02:31 ==
Hi Kevin,
I updated my peer to Ubuntu 16.10 and still saw the same behavior.
A snippet of the problem at hand is pasted below.
[ 20.610748] kdump-tools[4559]: Starting kdump-tools: * Mounting NFS mountpoint 150.1.1.20:/home/tools ...
[ 53.400516] kdump-tools[4559]: * Dumping to NFS mountpoint 150.1.1.20:/home/tools/201611100158
[ 53.409242] kdump-tools[4559]: * running makedumpfile -c -d 31 /proc/vmcore /mnt/var/crash/9.47.84.18-201611100158/dump-incomplete
[ 53.526593] kdump-tools[4559]: get_mem_map: Can't distinguish the memory type.
[ 53.527154] kdump-tools[4559]: The kernel version is not supported.
[ 53.527488] kdump-tools[4559]: The makedumpfile operation may be incomplete.
[ 53.527813] kdump-tools[4559]: makedumpfile Failed.
[ 53.528117] kdump-tools[4559]: * kdump-tools: makedumpfile failed, falling back to 'cp'
[ 90.754092] kdump-tools[4559]: cp: error writing '/mnt/var/crash/9.47.84.18-201611100158/vmcore-incomplete': Input/output error
[ 90.754857] kdump-tools[4559]: * kdump-tools: failed to save vmcore in /mnt/var/crash/9.47.84.18-201611100158
[ 90.756155] kdump-tools[4559]: * running makedumpfile --dump-dmesg /proc/vmcore /mnt/var/crash/9.47.84.18-201611100158/dmesg.201611100158
[ 90.758731] kdump-tools[4559]: get_mem_map: Can't distinguish the memory type.
[ 90.759089] kdump-tools[4559]: The kernel version is not supported.
[ 90.759436] kdump-tools[4559]: The makedumpfile operation may be incomplete.
[ 90.759780] kdump-tools[4559]: makedumpfile Failed.
[ 90.760094] kdump-tools[4559]: * kdump-tools: makedumpfile --dump-dmesg failed. dmesg content will be unavailable
[ 90.760668] kdump-tools[4559]: * kdump-tools: failed to save dmesg content in /mnt/var/crash/9.47.84.18-201611100158
[ 90.846117] kdump-tools[4559]: Thu, 10 Nov 2016 01:59:56 -0500
[ 90.886629] kdump-tools[4559]: Failed to read reboot parameter file: No such file or directory
[ 90.887070] kdump-tools[4559]: Rebooting.
== Comment: #13 - Kevin W. Rudd - 2016-11-11 17:12:33 ==
I was able to replicate this with debugging at both the kdump client
and remote NFS server. The server was perfectly happy with the data
coming at it, and appeared to be processing a COMMIT request from the
client when the client shut down the connection.
Looking at the client-side logs after a failure showed that it was
logging "server ... not responding" messages, and bailed on the
connection within the span of just a few seconds.
This appears to be due to a very over-aggressive timeout being
specified in /usr/sbin/kdump-config:
mount -t nfs -o nolock -o tcp -o soft -o timeo=5 -o retrans=5 $NFS $KDUMP_COREDIR
The timeo value is in deciseconds, and "5" is far too aggressive for this
type of connection. From my observations, the COMMIT was not issued
until about 60G had been transferred, and most remote servers will take a
lot longer than 5 tenths of a second to flush that amount of data and
respond to the COMMIT.
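To make the units concrete, here is a minimal sketch that converts a timeo
value to seconds. The helper name timeo_to_seconds is hypothetical and is not
part of kdump-tools.

```shell
# Convert an NFS timeo value (tenths of a second) to seconds.
# Uses integer arithmetic only, so it works in a plain POSIX shell.
timeo_to_seconds() {
    printf '%d.%d\n' "$(( $1 / 10 ))" "$(( $1 % 10 ))"
}

timeo_to_seconds 5     # the hardcoded value: prints 0.5
timeo_to_seconds 600   # the NFS default over TCP: prints 60.0
```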
I'm not sure what problem specifying this timeo value was supposed to
address, but it would be better to leave the timeo value at its
default for a tcp connection (let the TCP protocol handle any
communication timeouts on its own). When I modified kdump-config to
use the default timeo of 600, the kdump process transferred the entire
vmcore without error.
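One way to avoid a hardcoded timeout is to build the mount options from
overridable variables that fall back to the NFS defaults. This is only a
sketch of the idea, not the actual kdump-config code: the function name
nfs_mount_opts and the variable names NFS_TIMEO and NFS_RETRANS are
assumptions made for illustration.

```shell
# Build the NFS mount options, falling back to the NFS defaults
# (timeo=600, retrans=3) when NFS_TIMEO / NFS_RETRANS are unset or empty.
# NOTE: function and variable names are assumed, not taken from kdump-config.
nfs_mount_opts() {
    printf 'nolock,tcp,soft,timeo=%s,retrans=%s\n' \
        "${NFS_TIMEO:-600}" "${NFS_RETRANS:-3}"
}

# Print the resulting command instead of running it. $NFS and
# $KDUMP_COREDIR are the variables from the mount line quoted above.
echo "mount -t nfs -o $(nfs_mount_opts) \$NFS \$KDUMP_COREDIR"
```

Printing the command rather than executing it lets the options be checked on
a test machine before they are used in a real crash-dump environment.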