kernel-packages team mailing list archive
Message #81149
[Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8
This bug was fixed in the package linux - 3.13.0-36.63
---------------
linux (3.13.0-36.63) trusty; urgency=low
[ Joseph Salisbury ]
* Release Tracking Bug
- LP: #1365052
[ Feng Kan ]
* SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
modify write.
- LP: #1357527
* SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
assembly code
- LP: #1358949
[ Ming Lei ]
* SAUCE: (no-up) Drop APM X-Gene SoC Ethernet driver
- LP: #1360140
* [Config] Drop XGENE entries
- LP: #1360140
* [Config] CONFIG_NET_XGENE=m for arm64
- LP: #1360140
[ Stefan Bader ]
* SAUCE: Add compat macro for skb_get_hash
- LP: #1358162
* SAUCE: bcache: prevent crash on changing writeback_running
- LP: #1357295
[ Suman Tripathi ]
* SAUCE: (no-up) arm64: Fix the csr-mask for APM X-Gene SoC AHCI SATA PHY
clock DTS node.
- LP: #1359489
* SAUCE: (no-up) ahci_xgene: Skip the PHY and clock initialization if
already configured by the firmware.
- LP: #1359501
* SAUCE: (no-up) ahci_xgene: Fix the link down in first attempt for the
APM X-Gene SoC AHCI SATA host controller driver.
- LP: #1359507
[ Tuan Phan ]
* SAUCE: (no-up) pci-xgene-msi: fixed deadlock in irq_set_affinity
- LP: #1359514
[ Upstream Kernel Changes ]
* iwlwifi: mvm: Add a missed beacons threshold
- LP: #1349572
* mac80211: reset probe_send_count also in HW_CONNECTION_MONITOR case
- LP: #1349572
* genirq: Add an accessor for IRQ_PER_CPU flag
- LP: #1357527
* arm64: perf: add support for percpu pmu interrupt
- LP: #1357527
* cifs: sanity check length of data to send before sending
- LP: #1283101
* KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
- LP: #1329434
* KVM: nVMX: Rework interception of IRQs and NMIs
- LP: #1329434
* KVM: vmx: disable APIC virtualization in nested guests
- LP: #1329434
* HID: Add transport-driver functions to the USB HID interface.
- LP: #1353021
* ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host
Controller driver.
- LP: #1358498
* fold d_kill() and d_free()
- LP: #1354234
* fold try_prune_one_dentry()
- LP: #1354234
* new helper: dentry_free()
- LP: #1354234
* expand the call of dentry_lru_del() in dentry_kill()
- LP: #1354234
* dentry_kill(): don't try to remove from shrink list
- LP: #1354234
* don't remove from shrink list in select_collect()
- LP: #1354234
* more graceful recovery in umount_collect()
- LP: #1354234
* dcache: don't need rcu in shrink_dentry_list()
- LP: #1354234
* lift the "already marked killed" case into shrink_dentry_list()
* split dentry_kill()
- LP: #1354234
* expand dentry_kill(dentry, 0) in shrink_dentry_list()
- LP: #1354234
* shrink_dentry_list(): take parent's ->d_lock earlier
- LP: #1354234
* dealing with the rest of shrink_dentry_list() livelock
- LP: #1354234
* dentry_kill() doesn't need the second argument now
- LP: #1354234
* dcache: add missing lockdep annotation
- LP: #1354234
* fs: convert use of typedef ctl_table to struct ctl_table
- LP: #1354234
* lock_parent: don't step on stale ->d_parent of all-but-freed one
- LP: #1354234
* tools/testing/selftests/ptrace/peeksiginfo.c: add PAGE_SIZE definition
- LP: #1358855
* x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately
- LP: #1317697
* bnx2x: Fix kernel crash and data miscompare after EEH recovery
- LP: #1353105
* bnx2x: Adapter not recovery from EEH error injection
- LP: #1353105
* Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
- LP: #1359670
* bcache: fix crash on shutdown in passthrough mode
- LP: #1357295
* bcache: fix uninterruptible sleep in writeback thread
- LP: #1357295
* namespaces: Use task_lock and not rcu to protect nsproxy
- LP: #1328088
* MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver
- LP: #1360140
* Documentation: dts: Add bindings for APM X-Gene SoC ethernet driver
- LP: #1360140
* dts: Add bindings for APM X-Gene SoC ethernet driver
- LP: #1360140
* drivers: net: Add APM X-Gene SoC ethernet driver support.
- LP: #1360140
* powerpc/mm: Add new "set" flag argument to pte/pmd update function
- LP: #1357014
* powerpc/thp: Add write barrier after updating the valid bit
- LP: #1357014
* powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
- LP: #1357014
* powerpc/thp: Invalidate old 64K based hash page mapping before insert
of 4k pte
- LP: #1357014
* powerpc/thp: Handle combo pages in invalidate
- LP: #1357014
* powerpc/thp: Invalidate with vpn in loop
- LP: #1357014
* powerpc/thp: Use ACCESS_ONCE when loading pmdp
- LP: #1357014
* powerpc/mm: Use read barrier when creating real_pte
- LP: #1357014
* powerpc/thp: Add tracepoints to track hugepage invalidate
- LP: #1357014
* powerpc: subpage_protect: Increase the array size to take care of 64TB
- LP: #1357014
* mfd: rtsx: Add set pull control macro and simplify rtl8411
- LP: #1361086
* mfd: rtsx: Add support for card reader rtl8402
- LP: #1361086
* kvm: iommu: fix the third parameter of kvm_iommu_put_pages
(CVE-2014-3601)
- LP: #1362443
- CVE-2014-3601
* isofs: Fix unbounded recursion when processing relocated directories
- LP: #1362447, #1362448
- CVE-2014-5472
* net: sctp: inherit auth_capable on INIT collisions
- LP: #1349804
- CVE-2014-5077
* blk-mq: fix initializing request's start time
- LP: #1297522
[ Vinayak Kale ]
* SAUCE: (no-up) dt-bindings: Add Potenza PMU binding
- LP: #1357527
* SAUCE: (no-up) arm64: dts: Add PMU node for APM X-Gene Storm SOC
- LP: #1357527
-- Joseph Salisbury <joseph.salisbury@xxxxxxxxxxxxx> Wed, 03 Sep 2014 12:13:43 -0400
** Changed in: linux (Ubuntu Trusty)
Status: Fix Committed => Fix Released
** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2014-3601
** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2014-5077
** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2014-5472
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088
Title:
Kernel network namespace performance regression during rcu development
on kernels above 3.8
Status in The Linux Kernel:
In Progress
Status in “linux” package in Ubuntu:
Fix Released
Status in “linux” source package in Trusty:
Fix Released
Status in “linux” source package in Utopic:
Fix Released
Bug description:
SRU Justification:
Impact: network namespace creation has had a performance regression since v3.5.
Fix: my analysis, LKML discussion, upstream patch
Testcase:
http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py
http://people.canonical.com/~inaddy/lp1328088/charts/250.html
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
Run make_fake_routers.sh 4000 and use parse.py to check whether
"fake routers" are being created at a good rate per second (you can
compare the results against the generated charts).
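A minimal sketch of the kind of per-second rate check described above. The real parse.py is not reproduced here; the function name and the input format (a list of creation times in seconds since the start of the run) are assumptions made only for illustration.

```python
from collections import Counter


def creations_per_second(timestamps):
    """Return how many creations fell into each whole second of the run.

    `timestamps` is a hypothetical list of non-negative creation times,
    in seconds, relative to the start of the run.
    """
    # Bucket every creation by the whole second it happened in.
    buckets = Counter(int(t) for t in timestamps)
    if not buckets:
        return []
    last_second = max(buckets)
    # Seconds in which nothing was created are reported as 0.
    return [buckets.get(s, 0) for s in range(last_second + 1)]
```

A falling sequence of values as the run progresses would correspond to the slowdown the charts show.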
----------------------------
Original Description:
Please follow this at:
http://people.canonical.com/~inaddy/lp1328088/. It carries the same
description, updated on a daily basis.
--
It was brought to my attention that network namespace creation scalability was affected during kernel development.
The following scripts were used for all tests and chart generation:
http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py
I measured how many "fake routers" (created by the script above) could
be added per second, from 0 up to the 4000 created routers mark. Using
this script and a git bisect on the kernel tree, I was led to one
specific commit causing the regression: #911af50 "rcu: Provide
compile-time control for no-CBs CPUs". Even though this change was
experimental at that point, it introduced a performance scalability
regression (explained below) that still persists, and the option seems
to be the default for distributions nowadays.
RCU-related code appeared to be responsible for the problem. Given
that, every commit in v3.8..master that changed any of these
files: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
include/trace/events/rcu.h include/linux/rcupdate.h" was tested. The
idea was to track the performance regression across rcu development. In
the worst case, if the regression turned out not to be related to rcu,
I would still have data to interpret the performance/scalability
regression.
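A minimal sketch of how such a commit list could be gathered, assuming a local clone of the kernel tree. The helper names and the use of `git log --reverse --format=%H` are my own illustration, not necessarily how the list was actually produced.

```python
import subprocess

# Files named in the description above.
RCU_FILES = [
    "kernel/rcutree.c",
    "kernel/rcutree.h",
    "kernel/rcutree_plugin.h",
    "include/trace/events/rcu.h",
    "include/linux/rcupdate.h",
]


def rcu_commit_cmd(rev_range="v3.8..master", files=RCU_FILES):
    """Build a git command listing commits in rev_range that touch files."""
    # --reverse prints oldest first; --format=%H prints only the full hash.
    return ["git", "log", "--reverse", "--format=%H", rev_range, "--", *files]


def list_rcu_commits(repo_dir):
    """Run the command in repo_dir (hypothetical path to a kernel clone)."""
    result = subprocess.run(
        rcu_commit_cmd(), cwd=repo_dir, check=True,
        capture_output=True, text=True,
    )
    return result.stdout.split()
```

Each returned hash would then be built and measured with the same script.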
All text below refers to 2 groups of charts generated during the
study:
1) Kernel git tags from 3.8 to 3.14.
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
2) Kernel git commits for rcu development (111 commits).
http://people.canonical.com/~inaddy/lp1328088/charts/250.html
Since results differed depending on the number of cpus and on how the
no-cb cpus were configured, 3 kernel config options were used for
every measurement:
- CONFIG_RCU_NOCB_CPU (disabled): nocbno
- CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
- CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone
Obs: in the 1 cpu cases nocbno, nocbnone and nocball behave the same,
since with only 1 cpu there is no no-cb cpu.
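A minimal sketch of how a kernel .config could be mapped to the three labels above. The function name and the "unknown" fallback are hypothetical; the mapping simply follows the list as written.

```python
def nocb_label(config_text):
    """Classify the text of a kernel .config as nocbno, nocball or nocbnone."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Built-in options appear as CONFIG_FOO=y; disabled ones are
        # comments of the form "# CONFIG_FOO is not set".
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line[:-2])
    if "CONFIG_RCU_NOCB_CPU" not in enabled:
        return "nocbno"
    if "CONFIG_RCU_NOCB_CPU_ALL" in enabled:
        return "nocball"
    if "CONFIG_RCU_NOCB_CPU_NONE" in enabled:
        return "nocbnone"
    # Any other no-CBs choice is outside the three cases measured here.
    return "unknown"
```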
Once the charts were generated, it was clear that NOCB_CPU_ALL (4 cpus)
hurt the performance of the "fake routers" creation process, and that
this regression persists up to the current upstream version. It was
also clear that, after commit #911af50, having more than 1 cpu does not
improve performance/scalability for netns; it makes it worse.
#911af50
...
+#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ pr_info("\tExperimental no-CBs for all CPUs\n");
+ cpumask_setall(rcu_nocb_mask);
+#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
...
Comparing the points that stand out (see charts):
#81e5949 - good
#911af50 - bad
I was able to see that the following lines from the script above have
a major impact on netns scalability/performance:
1) ip netns add -> huge performance regression:
   1 cpu: no regression
   4 cpus: regression for NOCB_CPU_ALL
   obs: regression from 250 netns/sec to 50 netns/sec
        at the 500 netns already created mark
2) ip netns exec -> some performance regression:
   1 cpu: no regression
   4 cpus: regression for NOCB_CPU_ALL
   obs: regression from 40 netns/sec (+1 exec per netns
        creation) to 20 netns/sec at the 500 netns created
        mark
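To put the two observations above on the same scale, the arithmetic can be sketched as follows. The helper name is hypothetical, and the second call assumes the "40 netns" figure is also a per-second rate.

```python
def slowdown(before, after):
    """Return (slowdown factor, percent drop) between two creation rates."""
    factor = before / after
    percent_drop = 100.0 * (before - after) / before
    return factor, percent_drop


# ip netns add: 250 netns/sec down to 50 netns/sec
add_factor, add_drop = slowdown(250, 50)
# ip netns exec: 40 netns/sec down to 20 netns/sec (assumed unit)
exec_factor, exec_drop = slowdown(40, 20)
```

That is a 5x slowdown (80% drop) for "ip netns add" and a 2x slowdown (50% drop) for "ip netns exec" at the 500 netns mark.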
# Assumption (to be confirmed)
rcu callbacks being offloaded to other cpus caused the regression in
copy_net_ns<-create_new_namespaces or unshare(CLONE_NEWNET).
To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions