group.of.nepali.translators team mailing list archive

[Bug 1873074] Re: kernel panic hit by kube-proxy iptables-save/restore caused by aufs

To: group.of.nepali.translators@xxxxxxxxxxxxxxxxxxx
From: Mauricio Faria de Oliveira <1873074@xxxxxxxxxxxxxxxxxx>
Date: Wed, 22 Jul 2020 13:55:01 -0000
Reply-to: Bug 1873074 <1873074@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Marking as Fix Released for Xenial/Bionic/Focal (X/B/F) on these kernel package versions:
- Xenial: 4.4.0-186.216
- Bionic: 4.15.0-112.113
- Focal: 5.4.0-42.46

Covered in USNs:
https://usn.ubuntu.com/4425-1 
https://usn.ubuntu.com/4426-1 
https://usn.ubuntu.com/4427-1

** Changed in: linux (Ubuntu Bionic)
       Status: In Progress => Fix Released

** Changed in: linux (Ubuntu Focal)
       Status: In Progress => Fix Released

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Xenial)
       Status: New => Fix Released

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Xenial)
     Assignee: (unassigned) => Mauricio Faria de Oliveira (mfo)

** Changed in: linux (Ubuntu Eoan)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1873074

Title:
  kernel panic hit by kube-proxy iptables-save/restore caused by aufs

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Eoan:
  Fix Committed
Status in linux source package in Focal:
  Fix Released
Status in linux source package in Groovy:
  Won't Fix

Bug description:
  [Impact]

   * Systems with aufs mounts are vulnerable to a kernel BUG(),
     which can turn into a panic/crash if panic_on_oops is set.

   * It is exploitable by unprivileged local users, and potentially
     through remote access operations (e.g., via a web server).

   * This issue has also manifested in Kubernetes deployments
     with a kernel panic in iptables-save or iptables-restore
     after a few weeks of uptime, without user interaction.

   * Usually all Kubernetes worker nodes hit the issue around
     the same time.

  [Fix]

   * The issue is fixed with 2 patches in aufs4-linux.git:
   - 515a586eeef3 aufs: do not call i_readcount_inc()
   - f10aea57d39d aufs: bugfix, IMA i_readcount

   * The first addresses the issue, and the second addresses a
     regression in the aufs feature to change RW branches to RO.

   * The kernel v5.3 aufs patches had an equivalent fix to the
     second patch, which is present in the Focal aufs patchset
     (and on ubuntu-unstable/master & /master-5.8 on 20200629)

   - 1d26f910c53f aufs: for v5.3-rc1, maintain i_readcount
     (in aufs5-linux.git)
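
  The counter behaviour that the first patch addresses can be
  illustrated with a toy model. This is only a sketch and not kernel
  code: the class ReadCount, the function cycles_until_bug and the
  extra_inc_on_open flag are hypothetical names, and the model
  assumes (a) one extra increment per read-only open, as the patch
  title suggests, and (b) that the BUG() fires when a zero counter
  is decremented. A narrow counter width is used so the wrap can be
  reached quickly.

```python
class ReadCount:
    """Toy model of a wrapping read counter (illustration only)."""

    def __init__(self, bits=32, extra_inc_on_open=False):
        self.mask = (1 << bits) - 1
        self.value = 0
        # Assumption: the buggy path adds one extra increment per open.
        self.extra_inc_on_open = extra_inc_on_open

    def open_readonly(self):
        self.value = (self.value + 1) & self.mask
        if self.extra_inc_on_open:
            self.value = (self.value + 1) & self.mask

    def close(self):
        # Assumption: decrementing a zero counter is what triggers BUG().
        if self.value == 0:
            raise RuntimeError("BUG: read count is zero on decrement")
        self.value = (self.value - 1) & self.mask


def cycles_until_bug(bits, extra_inc_on_open, limit):
    """Return the open/close cycle that trips the check, or None."""
    counter = ReadCount(bits, extra_inc_on_open)
    for cycle in range(1, limit + 1):
        counter.open_readonly()
        try:
            counter.close()
        except RuntimeError:
            return cycle
    return None


if __name__ == "__main__":
    print(cycles_until_bug(8, True, 1000))   # leaky model, 8-bit counter
    print(cycles_until_bug(8, False, 1000))  # balanced model never trips
```

  In this model an n-bit counter trips the check on cycle 2^n - 1,
  which for 32 bits would be UINT_MAX cycles; the balanced variant
  returns to 0 after every cycle and never trips it.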

  [Test Case]

   * Repeatedly open/close the same file in read-only mode in
     aufs (UINT_MAX times, to overflow a signed int back to 0.)

   * Alternatively, monitor the underlying filesystem's file
     inode.i_readcount over several open/close system calls.
     (It should not monotonically increase; rather, it should return to 0.)
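
  The open/close loop described above might be sketched as follows.
  This is a minimal sketch, not the test program used for this bug:
  the function name open_close_readonly is hypothetical, and the
  demo uses a temporary file as a stand-in, whereas a real test
  would point at a file on an aufs mount and run for a very large
  number of iterations.

```python
import os
import tempfile


def open_close_readonly(path, iterations):
    """Open and close `path` read-only `iterations` times.

    Returns the number of completed open/close cycles.
    """
    completed = 0
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)  # read-only open
        os.close(fd)                     # matching close
        completed += 1
    return completed


if __name__ == "__main__":
    # Stand-in target: a temporary file, not a file on an aufs mount.
    with tempfile.NamedTemporaryFile() as tmp:
        print(open_close_readonly(tmp.name, 1000))
```

  Watching inode.i_readcount between iterations needs kernel-side
  inspection and is not covered by this sketch.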

  [Regression Potential]

   * This changes the core path through which aufs opens files, so
     there is a risk of regression; however, the fix makes aufs
     behave the way other filesystems do, so this is generally OK.
     In any case, most regressions would manifest in open() or
     close() (where the VFS handles/checks inode.i_readcount.)

   * The aufs maintainer has access to an internal test suite for
     validating aufs changes; it was used to identify the first
     regression (in the branch RW/RO mode change), and then to
     validate the patches before publishing them upstream, so they
     should be good now.

   * This has also been tested with 'stress-ng --class filesystem'
     and with 'xfstests -overlay' (patch to use aufs vs overlayfs)
     on Xenial/Bionic/Focal (-proposed vs. -proposed + patches).
     No regressions observed in stress-ng/xfstests log or dmesg.

  [Other Info]

   * Applied on Unstable (branches master and master-5.8)
   * Not required on Groovy (still 5.4; should sync from Unstable)
   * Required on LTS releases: Xenial, Bionic, and Focal.
   * Required on other releases: Disco and Eoan (for custom kernels)

  [Original Bug Description]

  Problem Report:
  --------------

  A user reported that several nodes in their Kubernetes clusters
  hit a kernel panic at about the same time, and periodically
  (usually at about 35 days of uptime, and in the same order in
  which the nodes booted.)

  The kernel panic message and stack trace are consistent across
  nodes: in __fput(), in iptables-save/restore run by kube-proxy.

  Example:

  """
  [3016161.866702] kernel BUG at .../include/linux/fs.h:2583!
  [3016161.866704] invalid opcode: 0000 [#1] SMP
  ...
  [3016161.866780] CPU: 40 PID: 33068 Comm: iptables-restor Tainted: P OE 4.4.0-133-generic #159-Ubuntu
  ...
  [3016161.866786] RIP: 0010:[...] [...] __fput+0x223/0x230
  ...
  [3016161.866818] Call Trace:
  [3016161.866823] [...] ____fput+0xe/0x10
  [3016161.866827] [...] task_work_run+0x86/0xb0
  [3016161.866831] [...] exit_to_usermode_loop+0xc2/0xd0
  [3016161.866833] [...] syscall_return_slowpath+0x4e/0x60
  [3016161.866839] [...] int_ret_from_sys_call+0x25/0x9f
  """

  (uptime: 3016161 seconds / (24*60*60) = 34.91 days)
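
  The uptime arithmetic can be reproduced with a short sketch, which
  also estimates the average open rate that would be needed to wrap
  a 32-bit counter in that time. The function names are
  hypothetical, and the rate rests on the assumption that every
  open/close cycle leaks exactly one count and that about 2^32
  cycles are needed; neither was measured on the affected nodes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def uptime_days(uptime_seconds):
    """Convert an uptime in seconds (as in the dmesg timestamp) to days."""
    return uptime_seconds / SECONDS_PER_DAY


def implied_cycles_per_second(uptime_seconds, wrap_count=2**32):
    """Average open/close rate needed to reach wrap_count in the uptime.

    Assumption: one leaked count per cycle, wrap after wrap_count cycles.
    """
    return wrap_count / uptime_seconds


if __name__ == "__main__":
    print(round(uptime_days(3016161), 2))
    print(round(implied_cycles_per_second(3016161)))
```

  Under those assumptions the implied average is on the order of
  1400 read-only open/close cycles per second on the affected file.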

  They have provided a crashdump (privately available) used
  for analysis later in this bug report.

  Note: the root cause turns out to be independent of K8s,
  as explained in the Root Cause section.

  Related Report:
  --------------

  This behavior matches this public bug of another user:
  https://github.com/kubernetes/kubernetes/issues/70229

  """
  I have several machines happen kernel panic，and these
  machine have same dump trace like below:

  KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-104-generic
  ...
  PANIC: "kernel BUG at .../include/linux/fs.h:2582!"
  ...
  COMMAND: "iptables-restor"
  ...
  crash> bt
  ...
  [exception RIP: __fput+541]
  ...
  #8 [ffff880199f33e60] __fput at ffffffff812125ac
  #9 [ffff880199f33ea8] ____fput at ffffffff812126ee
  #10 [ffff880199f33eb8] task_work_run at ffffffff8109f101
  #11 [ffff880199f33ef8] exit_to_usermode_loop at ffffffff81003242
  #12 [ffff880199f33f30] syscall_return_slowpath at ffffffff81003c6e
  #13 [ffff880199f33f50] int_ret_from_sys_call at ffffffff818449d0
  ...

  The above showed command "iptables-restor" cause the kernel
  panic and its pid is 16884，its parent process is kube-proxy.

  Sometimes the process of kernel panic is "iptables-save" and
  the dump trace are same.

  The kernel panic always happens every 26 days(machine uptime)
  """

  << Adding further sections as comments to keep page short. >>

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1873074/+subscriptions