[Bug 1921211] [NEW] Taking a memory dump of user mode process on Xenial hosts causes bugcheck/kernel panic and core dump
Public bug reported:
[Impact]
We have some Ubuntu 16.04 hosts (in Hyper-V) being used for testing Ubuntu 20.04 containers. As part of that testing, while attempting to take a memory dump of a container running SQL Server on Ubuntu 20.04 on one of the Ubuntu 16.04 hosts, we started seeing kernel panics and core dumps. This started happening after a specific Xenial kernel update on the host.
4.4.0-204-generic - systems that are crashing
4.4.0-201-generic - systems that are able to capture the dump
A note from the developer indicates the following logging showing up:
----
Now the following is output right after I attempt to start the dump. (gdb, attach ###, generate-core-file /var/opt/mssql/log/rdorr.delme.core)
[Fri Mar 19 20:01:38 2021] systemd-journald[581]: Successfully sent stream file descriptor to service manager.
[Fri Mar 19 20:01:41 2021] cni0: port 9(vethdec5d2b7) entered forwarding state
[Fri Mar 19 20:02:42 2021] systemd-journald[581]: Successfully sent stream file descriptor to service manager.
[Fri Mar 19 20:03:04 2021] ------------[ cut here ]------------
[Fri Mar 19 20:03:04 2021] kernel BUG at /build/linux-qlAbvR/linux-4.4.0/mm/memory.c:3214!
[Fri Mar 19 20:03:04 2021] invalid opcode: 0000 [#1] SMP
[Fri Mar 19 20:03:04 2021] Modules linked in: veth vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs libcrc32c ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6_tables xt_comment xt_mark xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables br_netfilter bridge stp llc aufs overlay nls_utf8 isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd input_leds serio_raw i2c_piix4 hv_balloon hyperv_fb 8250_fintek joydev mac_hid autofs4 hid_generic hv_utils hid_hyperv ptp hv_netvsc hid hv_storvsc pps_core
[Fri Mar 19 20:03:04 2021] hyperv_keyboard scsi_transport_fc psmouse pata_acpi hv_vmbus floppy fjes
[Fri Mar 19 20:03:04 2021] CPU: 1 PID: 24869 Comm: gdb Tainted: G W 4.4.0-204-generic #236-Ubuntu
[Fri Mar 19 20:03:04 2021] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 05/18/2018
[Fri Mar 19 20:03:04 2021] task: ffff880db9229c80 ti: ffff880d93b9c000 task.ti: ffff880d93b9c000
[Fri Mar 19 20:03:04 2021] RIP: 0010:[<ffffffff811cd93e>] [<ffffffff811cd93e>] handle_mm_fault+0x13de/0x1b80
[Fri Mar 19 20:03:04 2021] RSP: 0018:ffff880d93b9fc28 EFLAGS: 00010246
[Fri Mar 19 20:03:04 2021] RAX: 0000000000000100 RBX: 0000000000000000 RCX: 0000000000000120
[Fri Mar 19 20:03:04 2021] RDX: ffff880ea635f3e8 RSI: 00003ffffffff000 RDI: 0000000000000000
[Fri Mar 19 20:03:04 2021] RBP: ffff880d93b9fce8 R08: 00003ff32179a120 R09: 000000000000007d
[Fri Mar 19 20:03:04 2021] R10: ffff8800000003e8 R11: 00000000000003e8 R12: ffff8800ea672708
[Fri Mar 19 20:03:04 2021] R13: 0000000000000000 R14: 000000010247d000 R15: ffff8800f27fe400
[Fri Mar 19 20:03:04 2021] FS: 00007fdc26061600(0000) GS:ffff881025640000(0000) knlGS:0000000000000000
[Fri Mar 19 20:03:04 2021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Mar 19 20:03:04 2021] CR2: 000055e3a0011290 CR3: 0000000d93ba4000 CR4: 0000000000160670
[Fri Mar 19 20:03:04 2021] Stack:
[Fri Mar 19 20:03:04 2021] ffffffff81082929 fffffffffffffffd ffffffff81082252 ffff880d93b9fca8
[Fri Mar 19 20:03:04 2021] ffffffff811c7bca ffff8800f27fe400 000000010247d000 ffff880e74a88090
[Fri Mar 19 20:03:04 2021] 000000003a98d7f0 ffff880e00000001 ffff8800000003e8 0000000000000017
[Fri Mar 19 20:03:04 2021] Call Trace:
[Fri Mar 19 20:03:04 2021] [<ffffffff81082929>] ? mm_access+0x79/0xa0
[Fri Mar 19 20:03:04 2021] [<ffffffff81082252>] ? mmput+0x12/0x130
[Fri Mar 19 20:03:04 2021] [<ffffffff811c7bca>] ? follow_page_pte+0x1ca/0x3d0
[Fri Mar 19 20:03:04 2021] [<ffffffff811c7fe4>] ? follow_page_mask+0x214/0x3a0
[Fri Mar 19 20:03:04 2021] [<ffffffff811c82a0>] __get_user_pages+0x130/0x680
[Fri Mar 19 20:03:04 2021] [<ffffffff8122b248>] ? path_openat+0x348/0x1360
[Fri Mar 19 20:03:04 2021] [<ffffffff811c8b74>] get_user_pages+0x34/0x40
[Fri Mar 19 20:03:04 2021] [<ffffffff811c90f4>] __access_remote_vm+0xe4/0x2d0
[Fri Mar 19 20:03:04 2021] [<ffffffff811ef6ac>] ? alloc_pages_current+0x8c/0x110
[Fri Mar 19 20:03:04 2021] [<ffffffff811cfe3f>] access_remote_vm+0x1f/0x30
[Fri Mar 19 20:03:04 2021] [<ffffffff8128d3fa>] mem_rw.isra.16+0xfa/0x190
[Fri Mar 19 20:03:04 2021] [<ffffffff8128d4c8>] mem_read+0x18/0x20
[Fri Mar 19 20:03:04 2021] [<ffffffff8121c89b>] __vfs_read+0x1b/0x40
[Fri Mar 19 20:03:04 2021] [<ffffffff8121d016>] vfs_read+0x86/0x130
[Fri Mar 19 20:03:04 2021] [<ffffffff8121df65>] SyS_pread64+0x95/0xb0
[Fri Mar 19 20:03:04 2021] [<ffffffff8186acdb>] entry_SYSCALL_64_fastpath+0x22/0xd0
[Fri Mar 19 20:03:04 2021] Code: d4 ee ff ff 48 8b 7d 98 89 45 88 e8 2d c7 fd ff 8b 45 88 89 c3 e9 be ee ff ff 48 8b bd 70 ff ff ff e8 c7 cf 69 00 e9 ad ee ff ff <0f> 0b 4c 89 e7 4c 89 9d 70 ff ff ff e8 f1 c9 00 00 85 c0 4c 8b
[Fri Mar 19 20:03:04 2021] RIP [<ffffffff811cd93e>] handle_mm_fault+0x13de/0x1b80
[Fri Mar 19 20:03:04 2021] RSP <ffff880d93b9fc28>
[Fri Mar 19 20:03:04 2021] ---[ end trace 9d28a7e662aea7df ]---
[Fri Mar 19 20:03:04 2021] systemd-journald[581]: Compressed data object 806 -> 548 using XZ
------------------------
We think the following code may be relevant to the crashing behavior. This appears to be the relevant source for Ubuntu 4.4.0-204 (note that 4.4.0 is a Xenial kernel, so this is the 16.04 host, not the 20.04 container):
mm/memory.c in ~ubuntu-kernel/ubuntu/+source/linux/+git/xenial (launchpad.net)
static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
                        unsigned long addr, pte_t pte, pte_t *ptep, pmd_t *pmd)
{
        ...
        /* A PROT_NONE fault should not end up here */
        BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));  /* line 3214 */
We see the following fix but are not certain yet whether it is relevant. This is interesting: mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing (torvalds/linux@38e0885 on GitHub):
mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.
However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.
A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.
This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.
There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behavior by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.
This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
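Our reading of that change is that it drops the BUG_ON() and instead checks the VMA's access flags before entering the NUMA-balancing path. A paraphrased sketch of the idea (helper name and call site are illustrative, not copied from the Xenial tree):

/* Paraphrased sketch of the idea behind commit 38e0885: only treat a
 * PROT_NONE page-table entry as a NUMA hint fault when the VMA itself
 * grants some form of access; otherwise fall through to the normal
 * fault handling.  Names and signatures here are illustrative. */
static inline bool vma_is_accessible(struct vm_area_struct *vma)
{
        return vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
}

/* in handle_pte_fault(): */
        if (pte_protnone(entry) && vma_is_accessible(vma))
                return do_numa_page(mm, vma, address, entry, pte, pmd);
        /* ...and the BUG_ON() in do_numa_page() itself goes away. */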
We need help understanding how to prevent the kernel panic/core dump while taking a memory dump of a Focal container on a Xenial host.
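Independent of the full SQL Server setup in the test plan below, a much smaller trigger along the lines the commit message describes would be a PROT_NONE mapping read back through /proc/self/mem. The following is our own illustration (not the repro code attached to the upstream bugzilla entry); on an affected 4.4 kernel with CONFIG_NUMA_BALANCING enabled it may hit the BUG_ON() above, so it should only be run in a throwaway test VM:

/*
 * Minimal illustration (ours): map a PROT_NONE region, then read it
 * back through /proc/self/mem so the kernel reaches handle_mm_fault()
 * via access_remote_vm() without the usual userspace access checks.
 *
 * Build: gcc -o protnone-read protnone-read.c
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        size_t len = 4096;
        char buf[4096];

        /* PROT_NONE mapping: a direct access from userspace would SIGSEGV. */
        void *p = mmap(NULL, len, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /* Reading the same range via /proc/self/mem goes through
         * mem_rw() -> access_remote_vm() in the kernel instead. */
        int fd = open("/proc/self/mem", O_RDONLY);
        if (fd < 0) {
                perror("open /proc/self/mem");
                return 1;
        }

        ssize_t n = pread(fd, buf, len, (off_t)(uintptr_t)p);
        printf("pread returned %zd\n", n);

        close(fd);
        return 0;
}

The call trace above shows gdb reading /proc/<pid>/mem (mem_read -> access_remote_vm), which is the same kernel path.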
[Test Plan]
Testing on a 16.04 Azure instance, follow these steps:
$ echo 'GRUB_FLAVOUR_ORDER="generic"' | sudo tee -a /etc/default/grub.d/99-custom.cfg
$ sudo apt install linux-generic
$ sudo reboot
# log in again and confirm the system booted with the 4.4 kernel
$ sudo apt install docker.io gdb
$ sudo docker pull mcr.microsoft.com/mssql/server:2019-latest
$ sudo docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=<YourStrong@Passw0rd>" \
   -p 1433:1433 --name sql1 -h sql1 \
   -d mcr.microsoft.com/mssql/server:2019-latest
$ ps -ef | grep sqlservr   # note the PID of the sqlservr process
$ sudo gdb -p $PID -ex generate-core-file
# A kernel BUG should be triggered
[Where problems could occur]
The patch touches the mm subsystem, so there is always the potential for significant regressions; in that case a revert and a re-spin would probably be necessary.
On the other hand, this patch has been included in the mainline kernel since 4.8 without problems.
** Affects: linux (Ubuntu)
Importance: Undecided
Status: Incomplete
** Affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: In Progress
** Tags: xenial
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Xenial)
Status: New => In Progress
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1921211
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1921211/+subscriptions