kernel-packages team mailing list archive
Message #149677
[Bug 1521053] Re: Network Performance dropping between vms on different location in Azure
Environment
- Ubuntu trusty 14.04.3 (ubuntu-vivid kernel)
- DS2, West Europe <-> North Europe, Azure
- test app : netcat+nload, iperf
Logs
1. ===================================================================================================================
The customer provided an analysis of which kernel versions work and which do not.
Works
ii linux-image-3.16.0-52-generic 3.16.0-52.71~14.04.1 amd64 Linux kernel image for version 3.16.0 on 64 bit x86 SMP
ii linux-image-3.19.0-18-generic 3.19.0-18.18~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-20-generic 3.19.0-20.20~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-21-generic 3.19.0-21.21~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-22-generic 3.19.0-22.22~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-23-generic 3.19.0-23.24~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-25-generic 3.19.0-25.26~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-26-generic 3.19.0-26.28~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
Does not work
ii linux-image-3.19.0-28-generic 3.19.0-28.30~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-30-generic 3.19.0-30.34~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-31-generic 3.19.0-31.36~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
ii linux-image-3.19.0-32-generic 3.19.0-32.37~14.04.1 amd64 Linux kernel image for version 3.19.0 on 64 bit x86 SMP
======================================================================================================================
2.====================================================================================================================
Fail ( dropping )
----------------------------------------------------------------------------------------------------------------------
After bisecting between them,
I found that the commit below is where the dropping starts:
commit 1826dae15f7b5d4742bd54c0392b2280cad0ef60
Author: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
Date: Mon Apr 13 16:34:35 2015 -0700
hv_netvsc: Implement partial copy into send buffer
BugLink: http://bugs.launchpad.net/bugs/1454892
If remaining space in a send buffer slot is too small for the whole message,
we only copy the RNDIS header and PPI data into send buffer, so we can batch
one more packet each time. It reduces the vmbus per-message overhead.
Signed-off-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
Reviewed-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
(cherry picked from commit aa0a34be68290aa9aa071c0691fb8b6edda38358)
Signed-off-by: Joseph Salisbury <joseph.salisbury@xxxxxxxxxxxxx>
Acked-by: Tim Gardner <tim.gardner@xxxxxxxxxxxxx>
Acked-by: Brad Figg <brad.figg@xxxxxxxxxxxxx>
Signed-off-by: Brad Figg <brad.figg@xxxxxxxxxxxxx>
=====================================================================================================================
3. ==================================================================================================================
PASS ( no dropping )
---------------------------------------------------------------------------------------------------------------------
I tested the upstream tree checked out at the above commit.
=====================================================================================================================
4. ==================================================================================================================
After comparing upstream's and ubuntu-vivid's history around "hv_netvsc: Implement partial copy into send buffer",
I found several commits that differ between them:
981a1bd85a959bb3b44e07c212ebc61c62ad7cf9 hv_netvsc: use single existing drop path in netvsc_start_xmit
e88f7e078e47d4261a22e6f20a574620cbfc7a4b hv_netvsc: try linearizing big SKBs before dropping them
721514222db13498613706709409c21c105e0f4a hv_netvsc: Define a macro RNDIS_AND_PPI_SIZE
0d158852a8089099a6959ae235b20f230871982f hv_netvsc: Clean up two unused variables
59995370dbca7636c105ddadc0447fab86ad3887 hyperv: Implement netvsc_get_channels() ethool op
5ce58c2f13eaa8ca6d7e1041175433bd8cc55756 hv_netvsc: remove vmbus_are_subchannels_present() in rndis_filter_device_add()
999028cc1ccd1cd3a1c0104c6423553d3f573197 hyperv: match wait_for_completion_timeout return type
=====================================================================================================================
5. ==================================================================================================================
After several days of testing, I found one commit that improves performance:
0d158852a8089099a6959ae235b20f230871982f hv_netvsc: Clean up two unused variables
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0d158852a8089099a6959ae235b20f230871982f
=====================================================================================================================
6. ==================================================================================================================
Fail
---------------------------------------------------------------------------------------------------------------------
Split the above commit into parts.
Remove only the assignment (see below):
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index f699236..7e83c6a 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1011,7 +1011,6 @@ static void netvsc_receive(struct netvsc_device *net_device,
}
count = vmxferpage_packet->range_cnt;
- netvsc_packet->device = device;
netvsc_packet->channel = channel;
/* Each range represents 1 RNDIS pkt that contains 1 ethernet frame */
=====================================================================================================================
7. ==================================================================================================================
Pass
---------------------------------------------------------------------------------------------------------------------
Remove the member from the header (the assignment above must of course be removed as well):
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 309adee..95a25e4 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -130,7 +130,6 @@ struct hv_netvsc_packet {
u32 status;
bool part_of_skb;
- struct hv_device *device;
bool is_data_pkt;
bool xmit_more; /* from skb */
u16 vlan_tci;
=====================================================================================================================
8. ==================================================================================================================
Fail
---------------------------------------------------------------------------------------------------------------------
Remove the other part of the commit:
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index a160437..0d92efe 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -47,8 +47,6 @@ struct rndis_request {
/* Simplify allocation by having a netvsc packet inline */
struct hv_netvsc_packet pkt;
- /* Set 2 pages for rndis requests crossing page boundary */
- struct hv_page_buffer buf[2];
struct rndis_message request_msg;
/*
=====================================================================================================================
9. ==================================================================================================================
This is weird, so I tried adding extra bytes to the structures.
Fail:
- char a; added to the hv_netvsc_packet structure from which the device member was removed (steps 6 and 7)
- char a[2];
- char a[3]; // 4 is same as pointer so i didn't test
- char a[5];
- char a[32];
Pass:
- char a; added to the vmbus_channel structure, which is a member of the hv_netvsc_packet structure
=====================================================================================================================
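To make the experiments in steps 6 to 9 easier to reason about, here is a minimal C sketch that compares how the compiler lays out three variants of a packet structure. The struct names (pkt_with_device, pkt_without_device, pkt_with_char) and the helpers print_layouts() and check_layouts() are hypothetical and exist only for this illustration. Only the members visible in the step 7 diff are modeled, so the sizes and offsets are not those of the real hv_netvsc_packet. The sketch only shows that removing the pointer or adding a char shifts later members. It does not explain why the drops occur.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical, heavily simplified stand-ins for struct hv_netvsc_packet.
 * Only the members visible in the step 7 diff are modeled; the real kernel
 * structure has more members and a different overall layout.
 */
struct pkt_with_device {        /* before the cleanup commit */
    uint32_t status;
    bool part_of_skb;
    void *device;               /* stands in for struct hv_device *device */
    bool is_data_pkt;
    bool xmit_more;
    uint16_t vlan_tci;
};

struct pkt_without_device {     /* step 7: device member removed */
    uint32_t status;
    bool part_of_skb;
    bool is_data_pkt;
    bool xmit_more;
    uint16_t vlan_tci;
};

struct pkt_with_char {          /* step 9: char a; where device used to be */
    uint32_t status;
    bool part_of_skb;
    char a;
    bool is_data_pkt;
    bool xmit_more;
    uint16_t vlan_tci;
};

/* Print the total size and two member offsets for one layout. */
#define SHOW_LAYOUT(type)                                            \
    printf("%-20s size=%zu is_data_pkt@%zu vlan_tci@%zu\n", #type,   \
           sizeof(struct type), offsetof(struct type, is_data_pkt),  \
           offsetof(struct type, vlan_tci))

void print_layouts(void)
{
    SHOW_LAYOUT(pkt_with_device);
    SHOW_LAYOUT(pkt_without_device);
    SHOW_LAYOUT(pkt_with_char);
}

/* Relations that should hold on common ABIs (assumption, not a guarantee). */
void check_layouts(void)
{
    /* Dropping the pointer shrinks the structure. */
    assert(sizeof(struct pkt_with_device) >
           sizeof(struct pkt_without_device));
    /* Inserting a char pushes the following member to a higher offset. */
    assert(offsetof(struct pkt_with_char, is_data_pkt) >
           offsetof(struct pkt_without_device, is_data_pkt));
}
```

Calling print_layouts() from a small main() prints the sizes and offsets for the compiler and ABI in use. The exact numbers depend on the platform, so none are listed here.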
** Description changed:
[Impact]
- Ubuntu VM in Azure has network performance issue when check by using netcat&nload
- Normal bandwidth is 50MB/s ~ 100MB/s, but it's 0.3MB/s when dropping happens
+ Ubuntu VM in Azure has network performance issue
+ Normal bandwidth is 50MB/s ~ 100MB/s, but it's 0.3MB/s when dropping happens.
[Fix]
Upstream development
0d158852a8089099a6959ae235b20f230871982f ("hv_netvsc: Clean up two unused variables")
- It's affected over 3.19.0-28-generic (vivid)
+ It's affected over 3.19.0-28-generic (ubuntu-vivid)
With this commit, I confirmed that the problem has gone by the testing.
+ Test Logs
+ http://pastebin.ubuntu.com/13657083/
+
[Testcase]
- This is only for Azure service.
+ Make 2 VMs on North Europe, West Europe each.
+ Then run below test script
- Make 2 vms on North Europe, West Europe and run below test script
+ NE VM
- NE
+ - netcat & nload
+ while true; do netcat -l 8080 < /dev/zero; done;
+ nload -u M eth0 ( need nload pkg )
- while true; do netcat -l 8080 < /dev/zero; done;
+ - iperf
+ iperf -s -f M
- nload -u M eth0 ( need nload pkg )
+ WE VM
- WE
+ - netcat
+ for i in {1..1000}
+ do
+ timeout 30s nc NE_HOST 8080 > /dev/null
+ done
- for i in {1..1000}
- do
- timeout 30s nc NE_HOST 8080 > /dev/null
- done
+ - iperf
+ iperf -c HOST -f M
- Network performance dropping can be seen frequently
+ Network performance dropping can be seen frequently in nload graph.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1521053
Title:
Network Performance dropping between vms on different location in
Azure
Status in linux package in Ubuntu:
Confirmed
Status in linux source package in Vivid:
Confirmed
Bug description:
[Impact]
Ubuntu VMs in Azure have a network performance issue.
Normal bandwidth is 50 MB/s to 100 MB/s, but it falls to 0.3 MB/s when the dropping happens.
[Fix]
Upstream development
0d158852a8089099a6959ae235b20f230871982f ("hv_netvsc: Clean up two unused variables")
Kernels from 3.19.0-28-generic onward (ubuntu-vivid) are affected.
With this commit applied, I confirmed by testing that the problem is gone.
Test Logs
http://pastebin.ubuntu.com/13657083/
[Testcase]
Create one VM in North Europe and one in West Europe.
Then run the test scripts below.
NE VM
- netcat & nload
while true; do netcat -l 8080 < /dev/zero; done;
nload -u M eth0 (requires the nload package)
- iperf
iperf -s -f M
WE VM
- netcat
for i in {1..1000}
do
timeout 30s nc NE_HOST 8080 > /dev/null
done
- iperf
iperf -c HOST -f M
Network performance drops can be seen frequently in the nload graph.