kernel-packages team mailing list archive
Message #71766
[Bug 1344323] [NEW] Trusty kernel network performance regression
Public bug reported:
SRU Justification:
Impact:
Reduced TCP/IP receive performance for network devices that do not split
packet headers into the skb linear area (e.g., mlx4). The trusty kernel has
incorporated
commit eff44f9cc9a02aad53d568d3ae5020b6792ae4f6
Author: Jerry Chu <hkchu@xxxxxxxxxx>
Date: Wed Dec 11 20:53:45 2013 -0800
net-gro: Prepare GRO stack for the upcoming tunneling support
which modifies the GRO frag0 optimization, but unfortunately in some
cases results in a call to __pskb_pull_tail for every packet
received via the GRO path. This causes a reduction in TCP receive
performance (or, more accurately, an increase in CPU load for TCP
receive processing, which reduces throughput for CPU-limited
workloads).
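The difference between the two header-access paths can be illustrated with a minimal user-space sketch. This is not kernel code: `toy_skb`, `toy_pull`, `toy_header` and `toy_header_pull` are hypothetical names invented for illustration, and the model only assumes that a "pull" copies header bytes from the first fragment into the linear area, while the frag0 fast path returns a pointer into the fragment without copying.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an sk_buff. This is NOT the kernel structure. */
struct toy_skb {
    unsigned char linear[128];   /* linear (header) area */
    size_t linear_len;           /* bytes currently held in the linear area */
    const unsigned char *frag0;  /* first page fragment as filled by the NIC */
    size_t frag0_len;            /* bytes remaining in frag0 */
    unsigned long pulls;         /* number of copies performed so far */
};

/* Slow path: copy from frag0 until the linear area holds at least len bytes.
 * Models the per-packet cost attributed to __pskb_pull_tail in the report. */
static int toy_pull(struct toy_skb *skb, size_t len)
{
    size_t need;

    if (len <= skb->linear_len)
        return 1;                        /* already linear, nothing to do */
    need = len - skb->linear_len;
    if (need > skb->frag0_len || len > sizeof(skb->linear))
        return 0;                        /* not enough data or no room */
    memcpy(skb->linear + skb->linear_len, skb->frag0, need);
    skb->linear_len += need;
    skb->frag0 += need;
    skb->frag0_len -= need;
    skb->pulls++;
    return 1;
}

/* Fast path first: if nothing has been pulled yet and the header lies
 * entirely inside frag0, hand back a pointer into frag0 with no copy. */
static const unsigned char *toy_header(struct toy_skb *skb, size_t off, size_t hlen)
{
    assert(hlen > 0);
    if (skb->linear_len == 0 && off + hlen <= skb->frag0_len)
        return skb->frag0 + off;
    if (!toy_pull(skb, off + hlen))
        return NULL;
    return skb->linear + off;
}

/* Models the regressed behaviour: every header access pays for a pull. */
static const unsigned char *toy_header_pull(struct toy_skb *skb, size_t off, size_t hlen)
{
    assert(hlen > 0);
    if (!toy_pull(skb, off + hlen))
        return NULL;
    return skb->linear + off;
}
```

Calling `toy_header` on a fresh `toy_skb` whose headers sit in `frag0` leaves `pulls` at zero, whereas `toy_header_pull` increments it once per packet, which is the kind of extra per-packet work the perf capture attributes to `__pskb_pull_tail`.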
Fix:
This has already been fixed in mainline in
commit a50e233c50dbc881abaa0e4070789064e8d12d70
Author: Eric Dumazet <edumazet@xxxxxxxxxx>
Date: Sat Mar 29 21:28:21 2014 -0700
net-gro: restore frag0 optimization
The fix has been backported to and verified on the trusty kernel using
mlx4 devices and iperf; throughput increased from 7.5 to 8.5 Gb/sec
with the patch applied, and the relevant portions of the perf captures
show the call paths changing from:
     7.17%  iperf  [kernel.kallsyms]  [k] __pskb_pull_tail
            |
            --- __pskb_pull_tail
               |
               |--48.03%-- tcp_gro_receive
               |          tcp4_gro_receive
               |          inet_gro_receive
               |          dev_gro_receive
               |          napi_gro_frags
               |          mlx4_en_process_rx_cq
               |          mlx4_en_poll_rx_cq
               |          net_rx_action
               |          __do_softirq
               [...]
               |--28.53%-- napi_gro_frags
               |          mlx4_en_process_rx_cq
               |          mlx4_en_poll_rx_cq
               |          net_rx_action
               |          __do_softirq
               [...]
               |--13.11%-- inet_gro_receive
               |          dev_gro_receive
               |          napi_gro_frags
               |          mlx4_en_process_rx_cq
               |          mlx4_en_poll_rx_cq
               |          net_rx_action
               |          __do_softirq
to:
     4.87%  iperf  [kernel.kallsyms]  [k] skb_gro_receive
            |
            --- skb_gro_receive
               |
               |--98.13%-- tcp_gro_receive
               |          tcp4_gro_receive
               |          inet_gro_receive
               |          dev_gro_receive
               |          napi_gro_frags
               |          mlx4_en_process_rx_cq
               |          mlx4_en_poll_rx_cq
               |          net_rx_action
               |          __do_softirq
Testcase:
The fix was tested using mlx4 10 Gb/sec network devices between two arm64
systems, running "iperf -s" on one end and "iperf -c" on the other. The
unmodified kernel reported approximately 7.5 Gb/sec throughput; the
fixed kernel reported approximately 8.5 Gb/sec.
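For reference, the two approximate figures above work out to a relative throughput gain of roughly 13%. The small helper below is only a sketch of that arithmetic; `relative_gain_pct` is a hypothetical name and is not part of any test tooling mentioned in this report.

```c
#include <assert.h>

/* Relative change from 'before' to 'after', expressed in percent. */
static double relative_gain_pct(double before, double after)
{
    assert(before > 0.0);   /* a zero baseline has no meaningful ratio */
    return (after - before) / before * 100.0;
}
```

With the reported values, `relative_gain_pct(7.5, 8.5)` evaluates to about 13.3.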
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1344323
Title:
Trusty kernel network performance regression
Status in “linux” package in Ubuntu:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1344323/+subscriptions
[Bug 1344323] [NEW] Trusty kernel network performance regression
From: Jay Vosburgh, 2014-07-18