kernel-packages team mailing list archive
Message #77627
[Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8
This bug was fixed in the package linux - 3.16.0-11.16
---------------
linux (3.16.0-11.16) utopic; urgency=low
[ Mauricio Faria de Oliveira ]
* [Config] Switch kernel to vmlinuz (from vmlinux) on ppc64el
- LP: #1358920
[ Peter Zijlstra ]
* SAUCE: (no-up) mmu_notifier: add call_srcu and sync function for listener to delay call and sync
- LP: #1361300
[ Tim Gardner ]
* [Config] CONFIG_ZPOOL=y
- LP: #1360428
* Release Tracking Bug
- LP: #1361308
[ Upstream Kernel Changes ]
* Revert "net/mlx4_en: Fix bad use of dev_id"
- LP: #1347012
* net/mlx4_en: Reduce memory consumption on kdump kernel
- LP: #1347012
* net/mlx4_en: Fix mac_hash database inconsistency
- LP: #1347012
* net/mlx4_en: Disable blueflame using ethtool private flags
- LP: #1347012
* net/mlx4_en: current_mac isn't updated in port up
- LP: #1347012
* net/mlx4_core: Use low memory profile on kdump kernel
- LP: #1347012
* Drivers: scsi: storvsc: Change the limits to reflect the values on the host
- LP: #1347169
* Drivers: scsi: storvsc: Set cmd_per_lun to reflect value supported by the Host
- LP: #1347169
* Drivers: scsi: storvsc: Filter commands based on the storage protocol version
- LP: #1347169
* Drivers: scsi: storvsc: Fix a bug in handling VMBUS protocol version
- LP: #1347169
* Drivers: scsi: storvsc: Implement a eh_timed_out handler
- LP: #1347169
* drivers: scsi: storvsc: Set srb_flags in all cases
- LP: #1347169
* drivers: scsi: storvsc: Correctly handle TEST_UNIT_READY failure
- LP: #1347169
* namespaces: Use task_lock and not rcu to protect nsproxy
- LP: #1328088
* net: xgene: Check negative return value of xgene_enet_get_ring_size()
* mm/zbud: change zbud_alloc size type to size_t
- LP: #1360428
* mm/zpool: implement common zpool api to zbud/zsmalloc
- LP: #1360428
* mm/zpool: zbud/zsmalloc implement zpool
- LP: #1360428
* mm/zpool: update zswap to use zpool
- LP: #1360428
* ideapad-laptop: Change Lenovo Yoga 2 series rfkill handling
- LP: #1341296
* iommu/amd: Fix for pasid initialization
- LP: #1361300
* iommu/amd: Moving PPR fault flags macros definitions
- LP: #1361300
* iommu/amd: Drop oprofile dependency
- LP: #1361300
* iommu/amd: Fix typo in amd_iommu_v2 driver
- LP: #1361300
* iommu/amd: Don't call mmu_notifer_unregister in __unbind_pasid
- LP: #1361300
* iommu/amd: Don't free pasid_state in mn_release path
- LP: #1361300
* iommu/amd: Get rid of __unbind_pasid
- LP: #1361300
* iommu/amd: Drop pasid_state reference in ppr_notifer error path
- LP: #1361300
* iommu/amd: Add pasid_state->invalid flag
- LP: #1361300
* iommu/amd: Don't hold a reference to mm_struct
- LP: #1361300
* iommu/amd: Don't hold a reference to task_struct
- LP: #1361300
* iommu/amd: Don't call the inv_ctx_cb when pasid is not set up
- LP: #1361300
* iommu/amd: Don't set pasid_state->mm to NULL in unbind_pasid
- LP: #1361300
* iommu/amd: Remove change_pte mmu_notifier call-back
- LP: #1361300
* iommu/amd: Fix device_state reference counting
- LP: #1361300
* iommu/amd: Fix 2 typos in comments
- LP: #1361300
-- Tim Gardner <tim.gardner@xxxxxxxxxxxxx> Fri, 22 Aug 2014 08:45:54 -0400
** Changed in: linux (Ubuntu Utopic)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088
Title:
Kernel network namespace performance regression during rcu development
on kernels above 3.8
Status in The Linux Kernel:
In Progress
Status in “linux” package in Ubuntu:
Fix Released
Status in “linux” source package in Trusty:
Fix Committed
Status in “linux” source package in Utopic:
Fix Released
Bug description:
SRU Justification:
Impact: network namespace creation has had a performance regression since v3.5.
Fix: my analysis, LKML discussion, upstream patch
Testcase:
http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py
http://people.canonical.com/~inaddy/lp1328088/charts/250.html
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
Run make_fake_routers.sh 4000 and use parse.py to check whether
"fake routers" are being created at a good rate per second (you can
compare the result against the generated charts).
----------------------------
Original Description:
Please follow this at:
http://people.canonical.com/~inaddy/lp1328088/. It carries the same
description, updated on a daily basis.
--
It was brought to my attention that network namespace creation scalability regressed during kernel development.
The following scripts were used for all the tests and chart generation:
http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py
I measured how many "fake routers" (created by the script above) could
be added per second, from 0 up to the 4000 created routers mark. Using
this script and a git bisect on the kernel tree, I was led to one
specific commit causing the regression: #911af50 "rcu: Provide
compile-time control for no-CBs CPUs". Even though this change was
experimental at that point, it introduced a performance scalability
regression (explained below) that still persists, and the option it
added seems to be the default for distributions nowadays.
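The measurement described above (items created per second, sampled as the total grows) can be sketched roughly as follows. This is not the contents of make_fake_routers.sh or parse.py; the function name creation_rates, the window size of 250 and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, List, Tuple


def creation_rates(create: Callable[[int], None],
                   total: int,
                   window: int = 250,
                   clock: Callable[[], float] = time.monotonic
                   ) -> List[Tuple[int, float]]:
    """Call create(i) for i in 0..total-1 and report the creation rate.

    Returns one (items_created_so_far, items_per_second) pair per window.
    The window size and the clock are hypothetical knobs, not taken
    from the original scripts.
    """
    if total <= 0 or window <= 0:
        raise ValueError("total and window must be positive")
    rates = []
    window_start = clock()
    for i in range(total):
        create(i)
        done = i + 1
        if done % window == 0 or done == total:
            now = clock()
            elapsed = now - window_start
            # The last window may be shorter than the others.
            in_window = window if done % window == 0 else done % window
            rate = in_window / elapsed if elapsed > 0 else float("inf")
            rates.append((done, rate))
            window_start = now
    return rates
```

With root privileges, create could be something like lambda i: subprocess.run(["ip", "netns", "add", f"fake{i}"], check=True), where the fake{i} naming is again only an example. A falling rate as the count grows is the symptom this report is about.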
RCU-related code appeared to be responsible for the problem. Because of
that, every commit in v3.8..master that changed any of these files was
tested: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
include/trace/events/rcu.h include/linux/rcupdate.h". The idea was to
check for a performance regression during rcu development. In the
worst case, with the regression not being related to rcu, I would still
have data to interpret the performance/scalability regression.
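One way to enumerate the commits in that range, assuming a local clone of the kernel tree, is to ask git for every revision that touched the listed files. The helper name rcu_commit_list_cmd is hypothetical, and this is only a guess at how such a list could be built, not a record of the commands actually used.

```python
from typing import List, Sequence

# Files named in the report above.
RCU_PATHS = (
    "kernel/rcutree.c",
    "kernel/rcutree.h",
    "kernel/rcutree_plugin.h",
    "include/trace/events/rcu.h",
    "include/linux/rcupdate.h",
)


def rcu_commit_list_cmd(start: str = "v3.8", end: str = "master",
                        paths: Sequence[str] = RCU_PATHS) -> List[str]:
    """Build a git command listing commits in start..end that touch paths.

    --reverse prints the oldest commit first, which is convenient when
    building and testing the kernels in chronological order.
    """
    return ["git", "rev-list", "--reverse", f"{start}..{end}", "--", *paths]
```

The resulting list can be passed to subprocess.run(cmd, cwd=<kernel tree>, capture_output=True, text=True, check=True), and its stdout split into one commit id per line.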
All text below refers to 2 groups of charts generated during the
study:
1) Kernel git tags from 3.8 to 3.14.
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
2) Kernel git commits for rcu development (111 commits).
http://people.canonical.com/~inaddy/lp1328088/charts/250.html
Since results differed depending on the number of cpus and on how the
no-cb cpus were configured, 3 kernel config options were used for
every measurement:
- CONFIG_RCU_NOCB_CPU (disabled): nocbno
- CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
- CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone
Obs: for the 1 cpu cases, nocbno, nocbnone and nocball behave the same,
since with only 1 cpu there is no no-cb cpu.
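To tell which of the three labels a given kernel build falls under, its config file can be inspected. The sketch below assumes the usual "CONFIG_FOO=y" / "# CONFIG_FOO is not set" config format; the function name nocb_label and the "other" fallback are hypothetical additions.

```python
import re


def nocb_label(config_text: str) -> str:
    """Map a kernel config to the nocbno / nocball / nocbnone labels."""
    # Collect every CONFIG_RCU_NOCB_CPU* option that is set to "y".
    # Lines such as "# CONFIG_RCU_NOCB_CPU is not set" do not match.
    enabled = set(re.findall(r"^(CONFIG_RCU_NOCB_CPU\w*)=y$",
                             config_text, flags=re.MULTILINE))
    if "CONFIG_RCU_NOCB_CPU" not in enabled:
        return "nocbno"
    if "CONFIG_RCU_NOCB_CPU_ALL" in enabled:
        return "nocball"
    if "CONFIG_RCU_NOCB_CPU_NONE" in enabled:
        return "nocbnone"
    # Any other choice is outside the three cases measured here.
    return "other"
```

On many distributions the config of the running kernel is available as /boot/config-<release>, but that location is an assumption about the system layout.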
Once the charts were generated, it was clear that NOCB_CPU_ALL (4 cpus)
affected the performance of the "fake routers" creation process, and
that this regression persists up to the current upstream version. It
was also clear that, after commit #911af50, having more than 1 cpu does
not improve performance/scalability for netns; it makes it worse.
#911af50
...
+#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ pr_info("\tExperimental no-CBs for all CPUs\n");
+ cpumask_setall(rcu_nocb_mask);
+#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
...
Comparing the points that stand out (see charts):
#81e5949 - good
#911af50 - bad
I was able to see that the following commands from the script above
have a major impact on netns scalability/performance:
1) ip netns add -> huge performance regression:
   1 cpu: no regression
   4 cpu: regression for NOCB_CPU_ALL
   obs: regression from 250 netns/sec to 50 netns/sec
        at the 500 netns already created mark
2) ip netns exec -> some performance regression:
   1 cpu: no regression
   4 cpu: regression for NOCB_CPU_ALL
   obs: regression from 40 netns (+1 exec per netns
        creation) to 20 netns/sec at the 500 netns created
        mark
# Assumption (to be confirmed)
rcu callbacks being offloaded to other cpus caused the regression in
copy_net_ns <- create_new_namespaces or unshare(CLONE_NEWNET).
To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions