
kernel-packages team mailing list archive

[Bug 1226726] Re: dentry_reset_mounted walks entire mount list holding vfsmount write lock


Precise 3.8.0-33-generic lts-raring large VM:

# namespaces   time (ms)
           1           9
        1001          29
        2001          34
        3001         103
        4001          97
        5001         212
        6001         198
        7001         362
        8001         333
        9001         545
       10001         501
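
One quick way to read these figures is to check how the time changes
when the namespace count doubles. The following is a minimal sketch and
not part of the original report; the helper names growth_factor and
ms_per_namespace are made up for illustration, and the numbers are
copied from the table above.

```python
# Timings copied from the table above: (namespace count, elapsed ms).
SAMPLES = [
    (1, 9), (1001, 29), (2001, 34), (3001, 103), (4001, 97), (5001, 212),
    (6001, 198), (7001, 362), (8001, 333), (9001, 545), (10001, 501),
]


def growth_factor(samples, low, high):
    """Return time(high) / time(low) for two namespace counts in samples."""
    times = dict(samples)
    return times[high] / times[low]


def ms_per_namespace(samples):
    """Return (namespace count, ms per namespace) for every sample."""
    return [(ns, ms / ns) for ns, ms in samples]


if __name__ == "__main__":
    # Going from 5001 to 10001 namespaces roughly doubles the count.
    print(round(growth_factor(SAMPLES, 5001, 10001), 2))
```

On this data, the time at 10001 namespaces is about 2.4 times the time
at 5001 namespaces.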

** Tags removed: verification-needed-raring
** Tags added: verification-done-raring

You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.

  dentry_reset_mounted walks entire mount list holding vfsmount write
  lock

Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Precise:
  Won't Fix
Status in “linux” source package in Quantal:
  Fix Committed
Status in “linux” source package in Raring:
  Fix Committed
Status in “linux” source package in Saucy:
  Fix Released

Bug description:
  SRU Justification:

  Impact: When creating thousands of network namespaces, the delay in
  executing commands increases exponentially in kernels before commit
  84d17192.

  Fix: Upstream commit 84d17192 greatly improves the locking code in
  fs/namespace.c, resulting in much better performance as the number
  of namespaces increases.

  Testcase: Run test_ns.sh (below) and compare the resulting graphs
  for the existing version and the patched version.

  Additional Information: Because this is a change in the vfs layer, I
  ran the xfstests and compared before and after results of this patch.
  The patch did not create any additional failures in the generic

  The quantal and raring solutions differ, but both are based on the
  84d17192 patch. The quantal solution is a backport of this patch rather
  than a set of clean cherry-picks, because cherry-picking would have
  pulled in too many dependencies. The raring solution needed only two
  clean cherry-picks, which is why that approach was chosen.


  Whenever one enters a network namespace via "ip netns exec foobar
  somecommand" there is a mount done of the appropriate device on /sys
  since "somecommand" needs to see namespace specific versions of /sys
  directories. When the ip process exits these mounts need to be torn
  down, and that requires a global write lock for vfsmount_lock (this is
  a single-writer, multiple-reader lock). This has serious performance
  implications as the number of namespaces increases.
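
  The measurement described above can be approximated with a short
  script. This is a hedged sketch only and is not the attached
  test_ns.sh: the function names, the "sweep" namespace prefix, and the
  use of "true" as the command are assumptions made for illustration.
  run_sweep needs root and iproute2, and it leaves the namespaces in
  place (remove them with "ip netns delete NAME").

```python
import subprocess
import time


def time_command_ms(argv):
    """Run argv to completion and return the elapsed wall-clock time in ms."""
    start = time.monotonic()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return (time.monotonic() - start) * 1000.0


def netns_exec_argv(name, command):
    """Build the argv for 'ip netns exec <name> <command...>'."""
    return ["ip", "netns", "exec", name] + list(command)


def run_sweep(total=10001, step=1000, prefix="sweep"):
    """Create namespaces one by one; every `step` namespaces, time one exec.

    Returns a list of (namespace count, elapsed ms) pairs. Must run as root.
    """
    results = []
    for i in range(total):
        name = f"{prefix}{i}"
        subprocess.run(["ip", "netns", "add", name], check=True)
        if i % step == 0:
            # Entering the namespace mounts /sys; the teardown on exit is
            # the part that takes the global write lock.
            elapsed = time_command_ms(netns_exec_argv(name, ["true"]))
            results.append((i + 1, elapsed))
    return results
```

  Calling run_sweep() as root on an unpatched and then a patched kernel
  would give two series of (namespaces, ms) points that can be plotted
  against each other.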

  Commit 84d17192 addresses this issue, and running the attached
  testcase shows that it fixes the performance problems seen with
  large numbers of namespaces. I've included a graph comparing this
  fix with its parent commit to show the improvement in performance.
  The x-axis represents the number of namespaces and the y-axis is
  execution time in ms. After applying the patch, the delays no longer
  increase exponentially.

  This affects 3.2/3.5/3.8 series kernels, as it was fixed in 3.10.
