

[Bug 1921211] [NEW] Taking a memory dump of user mode process on Xenial hosts causes bugcheck/kernel panic and core dump

 

Public bug reported:

[Impact]

We have some Ubuntu 16.04 hosts (in Hyper-V) that are used to test an Ubuntu 20.04 container. As part of that testing we attempted to take a memory dump of a container running SQL Server on Ubuntu 20.04, and the Ubuntu 16.04 host started hitting a kernel panic and dumping core. It started happening after a specific Xenial kernel update on the host:
4.4.0-204-generic - systems running this kernel crash
4.4.0-201-generic - systems running this kernel can capture the dump


A note from the developer shows the following log output.
----
Now the following is output right after I attempt to start the dump. (gdb, attach ###, generate-core-file /var/opt/mssql/log/rdorr.delme.core)

[Fri Mar 19 20:01:38 2021] systemd-journald[581]: Successfully sent stream file descriptor to service manager.
[Fri Mar 19 20:01:41 2021] cni0: port 9(vethdec5d2b7) entered forwarding state
[Fri Mar 19 20:02:42 2021] systemd-journald[581]: Successfully sent stream file descriptor to service manager.
[Fri Mar 19 20:03:04 2021] ------------[ cut here ]------------
[Fri Mar 19 20:03:04 2021] kernel BUG at /build/linux-qlAbvR/linux-4.4.0/mm/memory.c:3214!
[Fri Mar 19 20:03:04 2021] invalid opcode: 0000 [#1] SMP
[Fri Mar 19 20:03:04 2021] Modules linked in: veth vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs libcrc32c ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6_tables xt_comment xt_mark xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables br_netfilter bridge stp llc aufs overlay nls_utf8 isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd input_leds serio_raw i2c_piix4 hv_balloon hyperv_fb 8250_fintek joydev mac_hid autofs4 hid_generic hv_utils hid_hyperv ptp hv_netvsc hid hv_storvsc pps_core
[Fri Mar 19 20:03:04 2021] hyperv_keyboard scsi_transport_fc psmouse pata_acpi hv_vmbus floppy fjes
[Fri Mar 19 20:03:04 2021] CPU: 1 PID: 24869 Comm: gdb Tainted: G W 4.4.0-204-generic #236-Ubuntu
[Fri Mar 19 20:03:04 2021] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 05/18/2018
[Fri Mar 19 20:03:04 2021] task: ffff880db9229c80 ti: ffff880d93b9c000 task.ti: ffff880d93b9c000
[Fri Mar 19 20:03:04 2021] RIP: 0010:[<ffffffff811cd93e>] [<ffffffff811cd93e>] handle_mm_fault+0x13de/0x1b80
[Fri Mar 19 20:03:04 2021] RSP: 0018:ffff880d93b9fc28 EFLAGS: 00010246
[Fri Mar 19 20:03:04 2021] RAX: 0000000000000100 RBX: 0000000000000000 RCX: 0000000000000120
[Fri Mar 19 20:03:04 2021] RDX: ffff880ea635f3e8 RSI: 00003ffffffff000 RDI: 0000000000000000
[Fri Mar 19 20:03:04 2021] RBP: ffff880d93b9fce8 R08: 00003ff32179a120 R09: 000000000000007d
[Fri Mar 19 20:03:04 2021] R10: ffff8800000003e8 R11: 00000000000003e8 R12: ffff8800ea672708
[Fri Mar 19 20:03:04 2021] R13: 0000000000000000 R14: 000000010247d000 R15: ffff8800f27fe400
[Fri Mar 19 20:03:04 2021] FS: 00007fdc26061600(0000) GS:ffff881025640000(0000) knlGS:0000000000000000
[Fri Mar 19 20:03:04 2021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Mar 19 20:03:04 2021] CR2: 000055e3a0011290 CR3: 0000000d93ba4000 CR4: 0000000000160670
[Fri Mar 19 20:03:04 2021] Stack:
[Fri Mar 19 20:03:04 2021] ffffffff81082929 fffffffffffffffd ffffffff81082252 ffff880d93b9fca8
[Fri Mar 19 20:03:04 2021] ffffffff811c7bca ffff8800f27fe400 000000010247d000 ffff880e74a88090
[Fri Mar 19 20:03:04 2021] 000000003a98d7f0 ffff880e00000001 ffff8800000003e8 0000000000000017
[Fri Mar 19 20:03:04 2021] Call Trace:
[Fri Mar 19 20:03:04 2021] [<ffffffff81082929>] ? mm_access+0x79/0xa0
[Fri Mar 19 20:03:04 2021] [<ffffffff81082252>] ? mmput+0x12/0x130
[Fri Mar 19 20:03:04 2021] [<ffffffff811c7bca>] ? follow_page_pte+0x1ca/0x3d0
[Fri Mar 19 20:03:04 2021] [<ffffffff811c7fe4>] ? follow_page_mask+0x214/0x3a0
[Fri Mar 19 20:03:04 2021] [<ffffffff811c82a0>] __get_user_pages+0x130/0x680
[Fri Mar 19 20:03:04 2021] [<ffffffff8122b248>] ? path_openat+0x348/0x1360
[Fri Mar 19 20:03:04 2021] [<ffffffff811c8b74>] get_user_pages+0x34/0x40
[Fri Mar 19 20:03:04 2021] [<ffffffff811c90f4>] __access_remote_vm+0xe4/0x2d0
[Fri Mar 19 20:03:04 2021] [<ffffffff811ef6ac>] ? alloc_pages_current+0x8c/0x110
[Fri Mar 19 20:03:04 2021] [<ffffffff811cfe3f>] access_remote_vm+0x1f/0x30
[Fri Mar 19 20:03:04 2021] [<ffffffff8128d3fa>] mem_rw.isra.16+0xfa/0x190
[Fri Mar 19 20:03:04 2021] [<ffffffff8128d4c8>] mem_read+0x18/0x20
[Fri Mar 19 20:03:04 2021] [<ffffffff8121c89b>] __vfs_read+0x1b/0x40
[Fri Mar 19 20:03:04 2021] [<ffffffff8121d016>] vfs_read+0x86/0x130
[Fri Mar 19 20:03:04 2021] [<ffffffff8121df65>] SyS_pread64+0x95/0xb0
[Fri Mar 19 20:03:04 2021] [<ffffffff8186acdb>] entry_SYSCALL_64_fastpath+0x22/0xd0
[Fri Mar 19 20:03:04 2021] Code: d4 ee ff ff 48 8b 7d 98 89 45 88 e8 2d c7 fd ff 8b 45 88 89 c3 e9 be ee ff ff 48 8b bd 70 ff ff ff e8 c7 cf 69 00 e9 ad ee ff ff <0f> 0b 4c 89 e7 4c 89 9d 70 ff ff ff e8 f1 c9 00 00 85 c0 4c 8b
[Fri Mar 19 20:03:04 2021] RIP [<ffffffff811cd93e>] handle_mm_fault+0x13de/0x1b80
[Fri Mar 19 20:03:04 2021] RSP <ffff880d93b9fc28>
[Fri Mar 19 20:03:04 2021] ---[ end trace 9d28a7e662aea7df ]---
[Fri Mar 19 20:03:04 2021] systemd-journald[581]: Compressed data object 806 -> 548 using XZ


------------------------

We think the following code may be relevant to the crashing behavior.
This appears to be the relevant source for the Ubuntu 4.4.0-204 kernel (note: 4.4.0 is a Xenial kernel, so the Ubuntu 20.04 piece here is the container, not the host):
mm/memory.c in ~ubuntu-kernel/ubuntu/+source/linux/+git/xenial (launchpad.net)

static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
                        unsigned long addr, pte_t pte, pte_t *ptep, pmd_t *pmd)
{
        ...
        /* A PROT_NONE fault should not end up here */
        BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)));   /* <-- line 3214 */


We see the following fix, but we are not certain yet whether it is relevant. The commit message (torvalds/linux@38e0885 on GitHub) reads:

mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.

However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.

A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.

This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.

There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behavior by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.

This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
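
For context, a minimal repro sketch along the lines of the trigger described above (this is our own illustrative program, not the repro code from the bugzilla report; it assumes CONFIG_NUMA_BALANCING=y, and it first populates the mapping and then switches it to PROT_NONE so that PTEs exist for the region):

/*
 * Illustrative sketch only: populate an anonymous mapping, make it
 * PROT_NONE, then read it back through /proc/self/mem.  The read goes
 * through access_remote_vm()/__get_user_pages(), the same path as in
 * the oops above, rather than through do_page_fault().  On an affected
 * kernel this is expected to reach do_numa_page() and its BUG_ON();
 * on a fixed kernel the pread() simply completes or fails with an error.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 4096;
	char buf[4096];

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) { perror("mmap"); return 1; }

	memset(p, 0xab, len);                      /* fault the pages in */
	if (mprotect(p, len, PROT_NONE)) { perror("mprotect"); return 1; }

	int fd = open("/proc/self/mem", O_RDONLY);
	if (fd < 0) { perror("open /proc/self/mem"); return 1; }

	/* read the PROT_NONE region via /proc/self/mem (the same kernel
	 * path gdb's generate-core-file takes through /proc/$PID/mem) */
	ssize_t n = pread(fd, buf, len, (off_t)(uintptr_t)p);
	printf("pread returned %zd\n", n);

	close(fd);
	return 0;
}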

We need help understanding how to prevent the kernel panic/core dump
while taking a memory dump of a Focal container on a Xenial host.
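
For reference, the shape of the change described in that upstream commit, as we understand it (a hand-written sketch against the 4.4 handle_pte_fault() path, not the actual upstream diff), is to only enter the NUMA balancing path when the VMA is actually accessible:

	/*
	 * Sketch of the idea behind 38e0885 (not the exact upstream diff):
	 * only treat a PROT_NONE pte as a NUMA hinting fault when the VMA
	 * itself is accessible.  A true PROT_NONE mapping faulted in via
	 * faultin_page()/__get_user_pages() then bypasses do_numa_page()
	 * and never reaches the BUG_ON() at mm/memory.c:3214.
	 */
	if (pte_protnone(entry) &&
	    (vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
		return do_numa_page(mm, vma, address, entry, pte, pmd);

If that gating is what landed upstream, a remote read of an application's PROT_NONE region should fall through to the normal present-pte handling instead of the NUMA path.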

[Test Plan]

To test on a 16.04 Azure instance, follow these steps:

$ echo 'GRUB_FLAVOUR_ORDER="generic"' | sudo tee -a /etc/default/grub.d/99-custom.cfg

$ sudo apt install linux-generic

$ sudo reboot

# log in again and confirm the system booted the 4.4 generic kernel (e.g. with uname -r)

$ sudo apt install docker.io gdb

$ sudo docker pull mcr.microsoft.com/mssql/server:2019-latest

$ sudo docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=<YourStrong@Passw0rd>" \
   -p 1433:1433 --name sql1 -h sql1 \
   -d mcr.microsoft.com/mssql/server:2019-latest

$ ps -ef | grep sqlservr

$ sudo gdb -p $PID -ex generate-core-file   # $PID is the PID of the sqlservr process found above

# A kernel BUG should be triggered

[Where problems could occur]

The patch touches the mm subsystem, so there is always the potential
for significant regressions; in that case a revert and a re-spin would
probably be necessary.

On the other hand, this patch has been included in the mainline kernel
since 4.8 without problems.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Affects: linux (Ubuntu Xenial)
     Importance: Undecided
         Status: In Progress


** Tags: xenial

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Xenial)
       Status: New => In Progress

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1921211

Title:
  Taking a memory dump of user mode process on Xenial hosts causes
  bugcheck/kernel panic and core dump

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  In Progress


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1921211/+subscriptions

