kernel-packages team mailing list archive

[Bug 1344323] Re: Trusty kernel network performance regression

 

** Changed in: linux (Ubuntu Trusty)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Importance: Medium => High

** Changed in: linux (Ubuntu Trusty)
   Importance: Medium => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1344323

Title:
  Trusty kernel network performance regression

Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  In Progress

Bug description:
  SRU Justification:

  Impact:

  Reduced TCP/IP receive performance for network devices that do not
  split packet headers into the skb linear area (e.g., mlx4).  The
  Trusty kernel has incorporated

  commit eff44f9cc9a02aad53d568d3ae5020b6792ae4f6
  Author: Jerry Chu <hkchu@xxxxxxxxxx>
  Date:   Wed Dec 11 20:53:45 2013 -0800

      net-gro: Prepare GRO stack for the upcoming tunneling support

  which modifies the GRO frag0 optimization but, in some cases, results
  in a call to __pskb_pull_tail for every packet received via the GRO
  path.  This reduces TCP receive performance (more accurately, it
  increases the CPU load of TCP receive processing, which reduces
  throughput for CPU-limited workloads).
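
  To illustrate why a per-packet pull is costly, here is a minimal
  userspace sketch of the idea behind the frag0 fast path.  It is not
  kernel code: struct toy_skb, toy_pull, toy_header, toy_receive, the
  buffer sizes and the 40-byte header length (20 bytes IPv4 plus 20
  bytes TCP, no options) are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINEAR_CAP 128  /* assumed size of the toy linear area */

/* Toy stand-in for an sk_buff: a linear area plus one page fragment. */
struct toy_skb {
    unsigned char linear[LINEAR_CAP];
    size_t linear_len;          /* bytes currently in the linear area */
    const unsigned char *frag;  /* data still held in the fragment */
    size_t frag_len;
    unsigned long pulls;        /* slow-path copies performed */
};

/* Slow path: copy from the fragment until `len` bytes are linear. */
const unsigned char *toy_pull(struct toy_skb *skb, size_t len)
{
    if (len > skb->linear_len) {
        size_t need = len - skb->linear_len;

        if (need > skb->frag_len || len > LINEAR_CAP)
            return NULL;
        memcpy(skb->linear + skb->linear_len, skb->frag, need);
        skb->linear_len += need;
        skb->frag += need;
        skb->frag_len -= need;
        skb->pulls++;
    }
    return skb->linear;
}

/*
 * Header access.  With the fast path enabled, headers that sit entirely
 * in the fragment are read in place and nothing is copied.
 */
const unsigned char *toy_header(struct toy_skb *skb, size_t len,
                                int use_frag0)
{
    if (use_frag0 && skb->linear_len == 0 && len <= skb->frag_len)
        return skb->frag;
    return toy_pull(skb, len);
}

/* Run n header-less-linear packets through toy_header; count pulls. */
unsigned long toy_receive(unsigned int n, int use_frag0)
{
    static const unsigned char payload[256] = { 0x45 };
    unsigned long pulls = 0;

    for (unsigned int i = 0; i < n; i++) {
        struct toy_skb skb = { .frag = payload,
                               .frag_len = sizeof(payload) };
        const unsigned char *hdr = toy_header(&skb, 40, use_frag0);

        assert(hdr != NULL);
        pulls += skb.pulls;
    }
    return pulls;
}
```

  In this model, toy_receive(n, 1) performs no copies, while
  toy_receive(n, 0) performs one copy per packet, which is the kind of
  per-packet overhead described above.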

  Fix:

  This has already been fixed in mainline in

  commit a50e233c50dbc881abaa0e4070789064e8d12d70
  Author: Eric Dumazet <edumazet@xxxxxxxxxx>
  Date:   Sat Mar 29 21:28:21 2014 -0700

      net-gro: restore frag0 optimization

  The fix has been backported to and verified on the Trusty kernel using
  mlx4 devices and iperf; throughput increased from approximately 7.5 to
  8.5 Gb/sec with the patch applied, and the relevant portions of the
  perf captures show the call paths changing from:

       7.17%            iperf  [kernel.kallsyms]   [k] __pskb_pull_tail                       
                        |
                        --- __pskb_pull_tail
                           |          
                           |--48.03%-- tcp_gro_receive
                           |          tcp4_gro_receive
                           |          inet_gro_receive
                           |          dev_gro_receive
                           |          napi_gro_frags
                           |          mlx4_en_process_rx_cq
                           |          mlx4_en_poll_rx_cq
                           |          net_rx_action
                           |          __do_softirq
  [...]
                           |--28.53%-- napi_gro_frags
                           |          mlx4_en_process_rx_cq
                           |          mlx4_en_poll_rx_cq
                           |          net_rx_action
                           |          __do_softirq
  [...]
                           |--13.11%-- inet_gro_receive
                           |          dev_gro_receive
                           |          napi_gro_frags
                           |          mlx4_en_process_rx_cq
                           |          mlx4_en_poll_rx_cq
                           |          net_rx_action
                           |          __do_softirq

  to:

       4.87%          iperf  [kernel.kallsyms]   [k] skb_gro_receive                        
                      |
                      --- skb_gro_receive
                         |          
                         |--98.13%-- tcp_gro_receive
                         |          tcp4_gro_receive
                         |          inet_gro_receive
                         |          dev_gro_receive
                         |          napi_gro_frags
                         |          mlx4_en_process_rx_cq
                         |          mlx4_en_poll_rx_cq
                         |          net_rx_action
                         |          __do_softirq

  Testcase:

  The fix was tested using mlx4 10Gb/sec network devices between two
  arm64 systems using "iperf -s" on one end and "iperf -c" on the other.
  The unmodified kernel reported approximately 7.5 Gb/sec throughput,
  the fixed kernel approximately 8.5 Gb/sec.
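
  For scale, the two iperf figures above correspond to a relative gain
  of roughly 13%.  A minimal sketch of that arithmetic follows; the
  helper name relative_gain_percent is hypothetical.

```c
/* Relative throughput change, in percent, between two measurements. */
double relative_gain_percent(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

  Calling relative_gain_percent(7.5, 8.5) gives a value between 13.3
  and 13.4.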

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1344323/+subscriptions

