openstack team mailing list archive

Re: Moving code hosting to GitHub

To: Robert Collins <robert.collins@xxxxxxxxxxxxx>
From: Thomas Goirand <thomas@xxxxxxxxxx>
Date: Mon, 11 Apr 2011 23:13:48 +0800
Cc: Elliot Murphy <elliot@xxxxxxxxxxxxx>, "openstack@xxxxxxxxxxxxxxxxxxx" <openstack@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <BANLkTimvJZfYnD4r-ga6C+OzrDgq9jwavQ@mail.gmail.com>
Openpgp: id=98EF9A49
Organization: GPLHost
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.16) Gecko/20110307 Icedove/3.0.11

On 04/11/2011 10:52 AM, Robert Collins wrote:
>>> Also, the fact that Git doesn't do network connections
>>> unless it's really needed is very welcome.
>
> bzr shouldn't do network connections except when really needed
> *either* : the world is big and networks are slow, so like other DVCS
> the strong preference it has is to cache data locally and only talk on
> the network when really needed.

It unfortunately does. "bzr launchpad-login", for example, does, and if
I'm not mistaken, so does "bzr commit". With Git, that's not the case.
The issue isn't caching data; the issue is that a commit should *never*
access any remote data, so that I can work on the train without
connectivity, for example, and still be able to run "bzr commit". Only
pull and push should access the network.

> We're desperately short of technical data on the slownesses reported
> from China *specifically*.

I'd be happy to help, but I'm very surprised that you didn't get reports
from Canonical people working in Beijing or Shanghai.

> Things that we'd love to know - how long does SSL handshake take for
> you, do you suffer packet loss talking to our servers, what's the peak
> bandwidth you can get back to our servers.

I can speak for connections from ChinaNet, which serves half of China,
but not for connectivity through China Unicom (these are the 2
operators in China, each selling ADSL access to half of the country).
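
The handshake timing asked about above could be collected with a few
lines of Python. This is only a minimal sketch using the standard
library: the function names are made up for illustration, and the host
in the usage note is an assumption about which server to probe, not
something stated in this thread.

```python
import socket
import ssl
import statistics
import time


def time_tls_handshake(host, port=443, timeout=30.0):
    """Return (tcp_connect_seconds, tls_handshake_seconds) for one attempt."""
    context = ssl.create_default_context()
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        connected = time.perf_counter()
        # wrap_socket() on an already connected socket runs the TLS
        # handshake immediately (do_handshake_on_connect defaults to True).
        with context.wrap_socket(raw, server_hostname=host):
            done = time.perf_counter()
    return connected - start, done - connected


def summarize(samples):
    """Reduce a list of timings in seconds to min / median / max."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

To use it, call time_tls_handshake("bazaar.launchpad.net") a handful of
times at different hours, collect the second value of each result, and
pass the list to summarize(). A timeout or reset raises an exception,
which is worth counting as a data point too.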

From here, in Shanghai, I barely get 8 KB/s when I do an initial bzr
branch (the equivalent of a clone in Git). That's the maximum speed I
saw; sometimes it just gets stuck, and CURL fails to download, receiving
half an SSL packet and printing a Python stack trace.

The reason is simple: the traceroute goes through Sprint, which I believe
has poor connectivity to China (I very rarely see them in traceroutes).
If you were getting some connectivity from twtelecom (or maybe from PCCW,
the biggest cable operator in Hong Kong, though twtelecom is better), the
situation would be much better. We have connectivity from twtelecom in
Atlanta, and it's really good, much better than what we have in Seattle
from HE.

>  - we have some analysis about performance of push and pull itself
> which the bzr guys are working on, that will go live as soon as they
> cut another release and we upgrade to bzr $thatversion

I was quite satisfied with the performance of pull and push; the
initial "bzr branch lp:xxx" ran at 2 MB/s on some of my servers.
That's really good, but only *if* your connection is good enough, which
isn't the case when I want to work on my laptop here.
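
To put the two rates mentioned in this mail (8 KB/s and 2 MB/s) side by
side, here is a rough back-of-the-envelope calculation. The branch size
is purely hypothetical, picked to make the arithmetic concrete, and
reading KB and MB as binary units is also an assumption.

```python
KIB = 1024
MIB = 1024 * KIB


def transfer_seconds(size_bytes, rate_bytes_per_second):
    """Time to move size_bytes at a steady rate, ignoring stalls and retries."""
    return size_bytes / rate_bytes_per_second


# Rates reported in this mail, read as binary units (an assumption).
SLOW_RATE = 8 * KIB   # direct connection from Shanghai
FAST_RATE = 2 * MIB   # from a well-connected server

# Hypothetical branch size, chosen only for illustration.
BRANCH_SIZE = 50 * MIB

slow = transfer_seconds(BRANCH_SIZE, SLOW_RATE)
fast = transfer_seconds(BRANCH_SIZE, FAST_RATE)
print(f"slow link: {slow / 60:.0f} min, fast link: {fast:.0f} s, "
      f"ratio: {slow / fast:.0f}x")
```

Whatever the real branch size is, the ratio between the two links stays
the same, and the slow figure is a best case since it assumes the
transfer never stalls.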

>  - we're considering an SSL frontend CDN with a node in asia

Not needed. Just get bandwidth from the right providers (like
twtelecom or PCCW), and it will be acceptable. Adding a cache won't help
much if the cache is badly connected...

>, but it's
> not at the very top of the list for performance: we're fixing the
> things that have the most impact - that affect everyone - before we
> start segmenting and improving performance for just one subset of the
> user base.

I'm not talking about *improving* performance, but about simply being
able to work with bzr at all. Can you imagine the frustration when I had
to run "bzr launchpad-login" 7 times before it worked (and, of course,
wait for the timeout each time)? Currently, doing that on my laptop with
a direct connection to Launchpad is nearly impossible at peak hours
(like 5 or 6 pm local time). For that reason, I've been working at night
(also so that I can go on IRC and get in touch with the people helping me
understand OpenStack as I discover it). So I have to go through my
servers, which not everyone here can do (not everyone has dozens of
servers all around the world like I do).

>  - the time it takes to deliver the html/json for a page is a key
> metric that we're driving down. 1/2 of the Launchpad developers are
> now in maintenance mode doing performance fixes and customer support.
> I'm completely confident we'll continue to make massive strides on
> this metric in the next 3-6 months. So far, we've dropped the peak
> time - the time the slowest pages in Launchpad take to render - by 9
> seconds (from a peak of 20 seconds).

Frankly, I very rarely connect directly to websites from here, because
of the slowness in China (and simply because I have solutions to speed
everything up). But that's not the case when I use bzr, unless I use a
VPN or something like that, which isn't something I like doing. So I'm
not really the one to ask about how fast the Launchpad website "feels".

> - I've been trying to find a Launchpad user there who can help rule
> out what's making things slow.

Don't search: Sprint is the one!!! As I'm writing this mail, it's 11pm,
and I get 20% packet loss... And that's not even peak hours here (which
are between 5 and 8pm local time). I can send traceroutes made with mtr
if you like, but I believe they would be tedious reading for this list.
Maybe we should switch to private emails?

I hope the above helps,

Thomas Goirand (zigo)
