sts-sponsors team mailing list archive
Message #01371
[Bug 1681909] Re: kdump is not captured in remote host when kdump over ssh is configured
makedumpfile merge to "1:1.6.6-2ubuntu1" sponsored in Eoan.
I appended to the changelog the entry block[0] currently found in eoan-proposed, which was missing, to keep track of everything that has been done on the package.
The entry was missing because the merge was made by cascardo before 1:1.6.5-1ubuntu3 existed.
Note:
- I didn't want this to block the upload, for many reasons, but cascardo/gpiccoli, could you have a look at this lintian report[2] before the feature freeze[1]? That would be awesome. It's good to modernize the code, but the Debian packaging too, especially when time permits, as it does now (devel release).
[0]
makedumpfile (1:1.6.5-1ubuntu3) eoan; urgency=medium
* debian/kdump-config.in:
- Add kdump retry/delay mechanism when dumping over network.
(LP: #1681909)
-- gpiccoli@xxxxxxxxxxxxx (Guilherme G. Piccoli) Thu, 04 Jul 2019 15:20:53 -0300
[1] - https://wiki.ubuntu.com/EoanErmine/ReleaseSchedule
[2] - https://pastebin.canonical.com/p/dWYkNhwjCb/
- Eric
--
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1681909
Title:
kdump is not captured in remote host when kdump over ssh is configured
Status in The Ubuntu-power-systems project:
In Progress
Status in makedumpfile package in Ubuntu:
Fix Committed
Status in makedumpfile source package in Xenial:
Won't Fix
Status in makedumpfile source package in Bionic:
In Progress
Status in makedumpfile source package in Cosmic:
Won't Fix
Status in makedumpfile source package in Disco:
In Progress
Status in makedumpfile source package in Eoan:
Fix Committed
Bug description:
[Impact]
* Kdump over network (such as an NFS mount or an SSH dump) relies on the
network-online target from systemd. Even so, some NICs report a
"Link Up" state before they are ready to transmit packets. This
misbehavior is probably caused by NIC firmware delays and usually
cannot be fixed in drivers. Adapters known to act like this include
bnx2x, tg3 and ixgbe.
* Kdump may be the last resort for debugging complex or
hard-to-reproduce issues, so it is worth increasing its reliability
and resilience. We therefore propose a quirk for network dumps: a
retry/delay mechanism. If the dump goes over the network, kdump
retries a few times and sleeps between attempts, which covers NICs
that are not ready yet but will soon be able to transmit packets.
* Although first reported by IBM on the PowerPC architecture, this
issue is tied to the NIC, and it was later reported on x86 as well.
[Test case]
This issue is usually difficult to reproduce naturally in a deterministic way, but there is an artificial test case in comment #24 of this LP bug.
In addition, one reporter on this bug managed to reproduce the problem consistently, and it no longer occurred when testing our solution.
[Regression potential]
There is no clear regression potential here, since this is just a retry/delay mechanism. Some problems could come from coding mistakes in the script.
The delay between attempts is only 3 seconds per iteration, so it should not block kdump progress for long at any one time.
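For illustration only, the retry/delay idea described under [Impact] could be sketched roughly as below. This is not the code from debian/kdump-config.in; the function name retry_with_delay and its arguments are hypothetical, and the retry count is left to the caller because the report does not state one. Only the 3-second delay comes from the text above.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual kdump-config implementation.
# Usage: retry_with_delay MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, sleeping
# DELAY_SECONDS between attempts. Returns 0 on success, 1 otherwise.
retry_with_delay() {
    max_attempts=$1
    delay=$2
    shift 2

    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example call (host name and retry count are made up):
#   retry_with_delay 5 3 ssh -o BatchMode=yes user@dumphost true
```

With a 3-second delay, N attempts add at most 3 * (N - 1) seconds of sleeping before the dump is given up, which is why the delay itself is unlikely to hold kdump up for long.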
[Other information]
Salsa Debian commit:
https://salsa.debian.org/debian/makedumpfile/commit/d63ba95337988be1eac8c8c76d90825ff5c6d17f
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1681909/+subscriptions