group.of.nepali.translators team mailing list archive
Message #15585
[Bug 1655625] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: vmcore cannot be analysed by crash
** Changed in: makedumpfile (Ubuntu)
Assignee: Nish Aravamudan (nacc) => (unassigned)
** Changed in: makedumpfile (Ubuntu)
Status: Confirmed => Fix Released
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1655625
Title:
ISST-LTE:pVM:roselp4:ubuntu 16.04.2: vmcore cannot be analysed by
crash
Status in The Ubuntu-power-systems project:
Fix Committed
Status in crash package in Ubuntu:
Fix Released
Status in makedumpfile package in Ubuntu:
Fix Released
Status in crash source package in Xenial:
Fix Released
Status in makedumpfile source package in Xenial:
Fix Released
Bug description:
[SRU justification]
This fix is required to make the crash tool usable. It also improves makedumpfile's filtering of pages.
[Impact]
Kernel crashes cannot be analysed with the crash tool.
makedumpfile filters pages incorrectly.
[Fix]
Cherry-pick upstream commits fixing those issues.
[Test Case]
Running the crash tool on a kernel crash file displays something like:
# crash -s usr/lib/debug/boot/vmlinux-4.8.0-34-generic
crash: read error: kernel virtual address: ffffffff81e29ff0 type: "pv_init_ops"
crash: this kernel may be configured with CONFIG_STRICT_DEVMEM, which
renders /dev/mem unusable as a live memory source.
crash: trying /proc/kcore as an alternative to /dev/mem
crash: seek error: kernel virtual address: ffffffff81e29ff0 type: "pv_init_ops"
crash: seek error: kernel virtual address: ffffffff82166130 type: "shadow_timekeeper xtime_sec"
crash: seek error: kernel virtual address: ffffffff81e0d304 type: "init_uts_ns"
crash: usr/lib/debug/boot/vmlinux-4.8.0-34-generic and /var/crash/201701191308/dump.201701191308 do not match!
With the fix, the crash command works as expected.
Running the crash tool on a vmcore file produced by makedumpfile may
return:
crash: page excluded: kernel virtual address: <> type: "fill_task_struct"
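The failure signatures in this test case (the "do not match!" error shown above, and the "page excluded" error that a makedumpfile-produced vmcore may trigger) can be checked for mechanically. This is a minimal sketch, assuming the crash output has already been captured to a file; the function name classify_crash_log and its return labels are hypothetical and not part of crash:

```shell
#!/bin/sh
# Hypothetical helper: classify captured crash output by the failure
# signatures quoted in this bug report.
#   $1: path to a file holding the captured output of a crash run
classify_crash_log() {
    log="$1"
    if grep -q 'do not match!' "$log"; then
        # crash could not reconcile the namelist with the dump file
        echo "namelist-mismatch"
    elif grep -q 'page excluded: kernel virtual address' "$log"; then
        # crash needed a page that makedumpfile had filtered out
        echo "page-excluded"
    else
        echo "no-known-failure"
    fi
}
```

With the fixed packages installed, a captured session would be expected to fall into the last branch.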
[Regression]
None expected, as these modifications are already part of Zesty and the upstream versions.
The makedumpfile patches are in Yakkety and Zesty (1.6.0 and later).
[Original description of the problem]
vmcore captured by kdump cannot be opened with crash:
% sudo crash -d1 /usr/lib/debug/boot/vmlinux-4.8.0-34-generic /var/crash/201612282137/dump.201612282137
... ...
base kernel version: 0.8.0
linux_banner:
????????
crash: /usr/lib/debug/boot/vmlinux-4 and /var/crash/201612282137/dump.201612282137 do not match!
Usage:
crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS] (dumpfile form)
crash [OPTION]... [NAMELIST] (live system form)
Enter "crash -h" for details.
It looks like crash cannot interpret the 'linux_banner'.
While the vmcore was being dumped, these messages were shown:
[ 729.609196] kdump-tools[5192]: The kernel version is not supported.
[ 729.609447] kdump-tools[5192]: The makedumpfile operation may be incomplete.
---uname output---
Linux roselp4 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:53:20 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = lpar
---Debugger---
A debugger is not configured
---Steps to Reproduce---
1. Configure kdump
2. Trigger kdump
3. Analyse the vmcore with crash
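The three steps above can be sketched as commands. This is only an illustration under stated assumptions: kdump-tools is installed and already configured, SysRq is enabled, and the debug vmlinux lives under /usr/lib/debug/boot. The function print_repro_steps is hypothetical, and it deliberately prints the commands instead of running them, because the second one crashes the machine on purpose:

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands for the three steps.
#   $1: kernel release, e.g. the output of `uname -r`
#   $2: dump directory name under /var/crash, e.g. 201612282137
print_repro_steps() {
    release="$1"
    stamp="$2"
    cat <<EOF
kdump-config show
echo c | sudo tee /proc/sysrq-trigger
sudo crash /usr/lib/debug/boot/vmlinux-$release /var/crash/$stamp/dump.$stamp
EOF
}
```

The dump path follows the /var/crash/<stamp>/dump.<stamp> layout seen in this report; the stamp is only known after the crash kernel has written the dump.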
Userspace tool common name: crash/makedumpfile
The userspace tool has the following bit modes: 64-bit
Userspace rpm: makedumpfile 1.5.9-5ubuntu0.3/crash 7.1.4-1ubuntu4
Userspace tool obtained from project website: na
*Additional Instructions for Ping Tian Han/pthan@xxxxxxxxxx:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach ltrace and strace of userspace application.
xtime timespec.tv_sec: 586481e8: Wed Dec 28 21:24:24 2016
utsname:
sysname: Linux
nodename: boblp1
release: 4.8.0-32-generic
version: #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016
machine: ppc64le
domainname: (none)
base kernel version: 4.8.0
verify_namelist:
dumpfile /proc/version:
Linux version 4.8.0-32-generic (buildd@bos01-ppc64el-001) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016 (Ubuntu 4.8.0-32.34~16.04.1-generic 4.8.11)
/usr/lib/debug/boot/vmlinux-4.8.0-32-generic:
Linux version 4.8.0-32-generic (buildd@bos01-ppc64el-001) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016 (Ubuntu 4.8.0-32.34~16.04.1-generic 4.8.11)
hypervisor: (undetermined)
crash: per_cpu_symbol_search(per_cpu__tvec_bases): NULL
ppc64_vmemmap_init: vmemmap base: f000000000000000
crash: PPC64: cannot find 'cpu_possible_map', 'cpu_present_map',
'cpu_online_map' or 'cpu_active_map' symbols
root@boblp1:/usr/lib/debug/boot# uname -a
Linux boblp1 4.8.0-32-generic #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
root@boblp1:/usr/lib/debug/boot#
1. Patches needed for v4.8 support are missing from the crash tool:
commit 098cdab16dfa6a85e9dad2cad604dee14ee15f66
Author: Dave Anderson <anderson@xxxxxxxxxx>
Date: Fri Feb 12 14:32:53 2016 -0500
Fix for the changes made to the kernel module structure introduced by
this kernel commit for Linux 4.5 and later kernels:
commit 8244062ef1e54502ef55f54cced659913f244c3e
modules: fix longstanding /proc/kallsyms vs module insertion race.
Without the patch, the crash session fails during initialization
with the error message: "crash: invalid structure member offset:
module_num_symtab".
(anderson@xxxxxxxxxx)
commit 6f1f78e33474d00d5f261d7ed9d835c558b34d61
Author: Dave Anderson <anderson@xxxxxxxxxx>
Date: Wed Jan 20 09:56:36 2016 -0500
Fix for the changes made to the kernel module structure introduced by
this kernel commit for Linux 4.5 and later kernels:
commit 7523e4dc5057e157212b4741abd6256e03404cf1
module: use a structure to encapsulate layout.
Without the patch, the crash session fails during initialization
with the error message: "crash: invalid structure member offset:
module_init_text_size".
(sebott@xxxxxxxxxxxxxxxxxx)
commit 1e92f9fad3a7e3042b16996306cb2335760ef8c8
Author: Dave Anderson <anderson@xxxxxxxxxx>
Date: Mon Feb 1 16:10:49 2016 -0500
Fix for the replacements made to the kernel's cpu_possible_mask,
cpu_online_mask, cpu_present_mask and cpu_active_mask symbols in
this kernel commit for Linux 4.5 and later kernels:
commit 5aec01b834fd6f8ca49d1aeede665b950d0c148e
kernel/cpu.c: eliminate cpu_*_mask
Without the patch, behavior is architecture-specific, dependent upon
whether the cpu mask values are used to calculate the number of cpus.
For example, ARM64 crash sessions fail during session initialization
with the error message "crash: zero-size memory allocation! (called
from <address>)", whereas X86_64 sessions come up normally, but
cpu mask values of zero are stored internally.
(anderson@xxxxxxxxxx)
commit 182914debbb9a2671ef644027fedd339aa9c80e0
Author: Dave Anderson <anderson@xxxxxxxxxx>
Date: Fri Sep 23 09:09:15 2016 -0400
With the introduction of radix MMU in Power ISA 3.0, there are
changes in kernel page table management accommodating it. This patch
series makes appropriate changes here to work for such kernels.
Also, this series fixes a few bugs along the way:
ppc64: fix vtop page translation for 4K pages
ppc64: Use kernel terminology for each level in 4-level page table
ppc64/book3s: address changes in kernel v4.5
ppc64/book3s: address change in page flags for PowerISA v3.0
ppc64: use physical addresses and unfold pud for 64K page size
ppc64/book3s: support big endian Linux page tables
The patches are needed for Linux v4.5 and later kernels on all
ppc64 hardware.
commit 8ceb1ac628bf6a0a7f0bbfff030ec93081bca4cd
Author: Dave Anderson <anderson@xxxxxxxxxx>
Date: Mon May 23 11:23:01 2016 -0400
Fix for Linux commit 0139aa7b7fa12ceef095d99dc36606a5b10ab83a, which
renamed the page._count member to page._refcount. Without the patch,
certain "kmem" commands fail with the "kmem: invalid structure member
offset: page_count".
(anderson@xxxxxxxxxx)
commit 7136bf8495948cb059e5595b8503f8ae37019fa1
Author: Dave Anderson <anderson@xxxxxxxxxx>
Date: Thu May 19 14:01:19 2016 -0400
Fix for Linux commit edf14cdbf9a0e5ab52698ca66d07a76ade0d5c46, which
has appended a NULL entry as the final member of the pageflag_names[]
array. Without the patch, a message that indicates "crash: failed to
read pageflag_names entry" is displayed during session initialization
in Linux 4.6 kernels.
(andrej.skvortzov@xxxxxxxxx)
2. The following makedumpfile commits are needed:
commit 5bc1f520cc7ab6e18abdd5af21c80ecda6339eb5
Author: Atsushi Kumagai <ats-kumagai@xxxxxxxxxxxxx>
Date: Tue Jan 26 10:11:33 2016 +0900
[PATCH] Looking for page.compound_order/compound_dtor to exclude
hugepages
* Required for kernel 4.4
Due to some changes in struct page, hugepages wouldn't be removed on
linux 4.4. makedumpfile reads page.lru.prev to get "order" (number of hugepages)
and page.lru.next to get "dtor" (destructor for hugepages) to detect hugepages,
but the offsets of the two were changed in linux 4.4.
kernel version  | where is order            | where is dtor
----------------+---------------------------+---------------------------
     - v3.19    | lru.prev                  | lru.next
v4.0 - v4.3     | compound_order(=lru.prev) | compound_dtor(=lru.next)
v4.4 -          | compound_order            | compound_dtor
As above, OFFSET(page.compound_order) and OFFSET(page.compound_dtor) are
definitely necessary in VMCOREINFO on linux 4.4 and later.
Further, the content of page.compound_dtor was changed from direct address
of dtor to the ID of it in linux 4.4.
Signed-off-by: Atsushi Kumagai <ats-kumagai@xxxxxxxxxxxxx>
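The version table in the commit message above can be restated as a small lookup. This is a sketch of the table only, not makedumpfile's actual code; the function name compound_field_source is hypothetical, and it looks only at the major and minor numbers of the kernel version:

```shell
#!/bin/sh
# Hypothetical helper: report where "order" and "dtor" are read from for a
# given kernel version, following the table in the commit message.
#   $1: kernel version such as 3.19, v4.3 or 4.8.0-34-generic
compound_field_source() {
    ver="${1#v}"            # drop an optional leading "v"
    major="${ver%%.*}"      # text before the first dot
    rest="${ver#*.}"        # text after the first dot
    minor="${rest%%[!0-9]*}"  # leading digits of the minor number
    if [ "$major" -lt 4 ]; then
        echo "lru.prev lru.next"
    elif [ "$major" -eq 4 ] && [ "${minor:-0}" -le 3 ]; then
        echo "compound_order(=lru.prev) compound_dtor(=lru.next)"
    else
        echo "compound_order compound_dtor"
    fi
}
```

For the 4.8.0-34-generic kernel in this report the last row applies, which is why OFFSET(page.compound_order) and OFFSET(page.compound_dtor) are needed in VMCOREINFO.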
commit 13b4233e91a9d5aa14c4b0643af36cbc29b9fa7a
Author: Atsushi Kumagai <ats-kumagai@xxxxxxxxxxxxx>
Date: Wed Feb 24 17:09:44 2016 +0900
[PATCH] Skip examining compound tail pages
* Required for kernel 4.5
For filtering user pages, we check whether each page's
page->mapping has the PAGE_MAPPING_ANON bit set.
However, unexcludable compound tail pages can have
PAGE_MAPPING_ANON set since kernel 4.5, so they can be
wrongly excluded as user pages.
Now, we don't need to check compound tail pages because
excludable compound pages must be excluded all at once by
exclude_range() when the corresponding head page is checked.
So just skipping tail pages avoids the wrong filtering.
Signed-off-by: Atsushi Kumagai <ats-kumagai@xxxxxxxxxxxxx>
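The decision described in the commit message above can be illustrated with a toy classifier. This is a sketch and not makedumpfile's code: the function classify_page and its labels are hypothetical, and the only kernel fact it relies on is that PAGE_MAPPING_ANON is the lowest bit of page->mapping:

```shell
#!/bin/sh
# PAGE_MAPPING_ANON is bit 0 of page->mapping in the Linux kernel.
PAGE_MAPPING_ANON=1

# Hypothetical helper: decide how the user-page check treats one page.
#   $1: page->mapping value (decimal or 0x-prefixed hex)
#   $2: "head", "tail" or "single"
classify_page() {
    mapping="$1"
    kind="$2"
    if [ "$kind" = "tail" ]; then
        # Tail pages are not examined; they are handled together with
        # their head page.
        echo "skip"
    elif [ $(( $mapping & $PAGE_MAPPING_ANON )) -ne 0 ]; then
        echo "user"
    else
        echo "not-user"
    fi
}
```

Checking the page kind before the mapping bit is the point of the patch: a tail page whose mapping happens to have the bit set is never classified as a user page on its own.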
3. The linux-image dbgsym version installed must be pulled from a different repo
instead of the one meant for 16.04.2, because the gcc versions of the kernel
image (/boot/vmlinux-4.8.0-34-generic) and the vmlinux with debug
symbols (usr/lib/debug/boot/vmlinux-4.8.0-34-generic) don't match.
Please use the following repos:
sudo tee /etc/apt/sources.list.d/ddebs.list << EOF
deb http://ddebs.ubuntu.com/ $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-security main restricted universe multiverse
deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-proposed main restricted universe multiverse
EOF
to install the linux-image-4.8.0-34-generic-dbgsym package.
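Whether two kernel images were built with the same compiler can be checked by comparing the gcc string in their "Linux version" banners, such as the ones crash prints under verify_namelist. This is a minimal sketch that works on banner strings already in hand; the function names gcc_of_banner and same_compiler are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: extract the compiler string from a kernel banner,
# i.e. the text between "(gcc version " and the first closing parenthesis.
#   $1: a "Linux version ..." banner string
gcc_of_banner() {
    printf '%s\n' "$1" | sed -n 's/.*(gcc version \([^)]*)\).*/\1/p'
}

# Hypothetical helper: succeed only if both banners name the same compiler.
#   $1, $2: banner strings to compare
same_compiler() {
    [ "$(gcc_of_banner "$1")" = "$(gcc_of_banner "$2")" ]
}
```

If same_compiler fails for the running kernel and the installed dbgsym vmlinux, the dbgsym package likely came from a repository that does not correspond to the kernel build.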
Thanks
[snip]
>
> 3. The linux-image dbgsym version installed must be pulled from a different
> repo
s/must be pulled/must have been pulled/
Applied the crash utility's missing patches on top of
crash-7.1.4-1ubuntu4 and the makedumpfile tool's missing patches on top of
makedumpfile-1.5.9-5ubuntu0.3. Did some sanity testing of the
patched binaries, and they worked as expected.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1655625/+subscriptions