openstack team mailing list archive
Message #09081
Re: Caching strategies in Nova ...
To: <openstack@xxxxxxxxxxxxxxxxxxx>
From: Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx>
Date: Fri, 23 Mar 2012 20:13:31 -0300
In-reply-to: <CAKe5d-RUd99WOjn6ZUs9cf6BtRv7Bvf6J5aAU1ZbNH1z=8TQgg@mail.gmail.com>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.27) Gecko/20120216 Thunderbird/3.1.19

Great suggestions, guys ... we'll give some thought to how the community
can share and compare performance measurements in a consistent way.
-S
On 03/23/2012 07:26 PM, Joe Gordon wrote:
> +1
>
> Documenting these findings would be nice too.
>
>
> best,
> Joe
>
> On Fri, Mar 23, 2012 at 2:15 PM, Justin Santa Barbara
> <justin@xxxxxxxxxxxx> wrote:
>
> This is great: hard numbers are exactly what we need. I would love
> to see a statement-by-statement SQL log with timings from someone
> who has a performance issue. I'm happy to look into any DB
> problems that it reveals.
>
> The nova database is small enough that it should always be in-memory
> (if you're running a million VMs, I don't think asking for one
> gigabyte of RAM on your DB is unreasonable!)
>
> If it isn't hitting disk, PostgreSQL or MySQL with InnoDB can serve
> 10k 'indexed' requests per second through SQL on a low-end (<$1000)
> box. With tuning you can get 10x that. Using one of the SQL bypass
> engines (e.g. MySQL HandlerSocket) can supposedly give you 10x
> again. Throwing money at the problem in the form of multi-processor
> boxes (or disks if you're I/O bound) can probably get you 10x again.
>
> However, if you put a DB on a remote host, you'll have to wait for a
> network round-trip per query. If your ORM is doing a 1+N query, the
> total read time will be slow. If your DB is doing a sync on every
> write, writes will be slow. If the DB isn't tuned with a sensible
> amount of cache (at least as big as the DB size), it will be
> slow(er). Each of these has a very simple fix for OpenStack.
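To illustrate the 1+N pattern mentioned above, here is a minimal sketch using Python's standard `sqlite3` module. The two-table schema and the function names are hypothetical and are not taken from Nova's actual models; the point is only to show how the number of round trips grows with the number of rows when related data is fetched per item, compared with a single joined query.

```python
import sqlite3

# Hypothetical schema for illustration only (not Nova's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE security_groups (
        id INTEGER PRIMARY KEY, instance_id INTEGER, name TEXT);
    INSERT INTO instances VALUES (1, 'vm-a'), (2, 'vm-b'), (3, 'vm-c');
    INSERT INTO security_groups VALUES
        (1, 1, 'default'), (2, 2, 'default'), (3, 2, 'web'), (4, 3, 'db');
""")


def groups_one_plus_n(conn):
    """1 query for the instances, then 1 more query per instance."""
    queries = 0
    result = {}
    rows = conn.execute(
        "SELECT id, name FROM instances ORDER BY id").fetchall()
    queries += 1
    for instance_id, name in rows:
        groups = conn.execute(
            "SELECT name FROM security_groups "
            "WHERE instance_id = ? ORDER BY id",
            (instance_id,)).fetchall()
        queries += 1
        result[name] = [g[0] for g in groups]
    return result, queries


def groups_single_join(conn):
    """The same data fetched with one joined query."""
    result = {}
    rows = conn.execute("""
        SELECT i.name, g.name
        FROM instances i
        LEFT JOIN security_groups g ON g.instance_id = i.id
        ORDER BY i.id, g.id""").fetchall()
    for instance_name, group_name in rows:
        result.setdefault(instance_name, [])
        if group_name is not None:
            result[instance_name].append(group_name)
    return result, 1


if __name__ == "__main__":
    print(groups_one_plus_n(conn))
    print(groups_single_join(conn))
```

With three instances the first function issues four statements and the second issues one. Against a remote database each statement costs a network round trip, so the gap widens with the size of the listing.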
>
> Relational databases have very efficient caching mechanisms built
> in. Any out-of-process cache will have a hard time beating them.
> Let's make sure the bottleneck is the DB, and not (for example)
> RabbitMQ, before we go off on a huge rearchitecture.
>
> Justin
>
>
>
>
> On Thu, Mar 22, 2012 at 7:53 PM, Mark Washenberger
> <mark.washenberger@xxxxxxxxxxxxx> wrote:
>
> Working on this independently, I created a branch with some simple
> performance logging around the nova-api, and individually around
> glance, nova.db, and nova.rpc calls. (Sorry, I only have a local
> copy and it's on a different computer right now, and it probably needs
> a rebase. I will rebase and publish it on GitHub tomorrow.)
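The branch itself is not shown in this thread, so the following is only a guess at what "simple performance logging" around a call might look like: a decorator that records wall-clock time under a dotted key. The logger name `perflog`, the decorator `timed`, and the stand-in function `instance_get` are all hypothetical, and the sketch uses modern Python 3 (`time.perf_counter`) rather than whatever the original branch used.

```python
import functools
import logging
import time

# Hypothetical logger name; not taken from the nova-perflog project.
LOG = logging.getLogger("perflog")


def timed(key):
    """Log the wall-clock duration of each call under the given key."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised.
                elapsed = time.perf_counter() - start
                LOG.info("%s %.3f", key, elapsed)
        return wrapper
    return decorator


@timed("nova.db.api.instance_get")
def instance_get(instance_id):
    # Stand-in for a real database call.
    return {"id": instance_id}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    instance_get(42)
```

Logging in a `finally` clause means failed calls are timed too, which matters when slow requests are the ones that time out.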
>
> With this logging, I could get some simple profiling that I found
> very useful. Here is a GH project with the analysis code as well
> as some nova-api logs I was using as input.
>
> https://github.com/markwash/nova-perflog
>
> With these tools, you can get a wall-time profile for individual
> requests. For example, looking at one server create request (and
> you can run this directly from the checkout as the logs are saved
> there):
>
> markw@poledra:perflogs$ cat nova-api.vanilla.1.5.10.log | python profile-request.py req-3cc0fe84-e736-4441-a8d6-ef605558f37f
> key                                        count  avg
> nova.api.openstack.wsgi.POST                   1  0.657
> nova.db.api.instance_update                    1  0.191
> nova.image.show                                1  0.179
> nova.db.api.instance_add_security_group        1  0.082
> nova.rpc.cast                                  1  0.059
> nova.db.api.instance_get_all_by_filters        1  0.034
> nova.db.api.security_group_get_by_name         2  0.029
> nova.db.api.instance_create                    1  0.011
> nova.db.api.quota_get_all_by_project           3  0.003
> nova.db.api.instance_data_get_for_project      1  0.003
>
> key                                        count  total
> nova.api.openstack.wsgi                        1  0.657
> nova.db.api                                   10  0.388
> nova.image                                     1  0.179
> nova.rpc                                       1  0.059
> All times are in seconds. The nova.rpc time is probably high
> since this was the first call since server restart, so the
> connection handshake is probably included. These numbers are also
> probably about 1.5 months stale.
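The per-module totals in the second table can be reproduced from the per-call rows in the first: multiply each count by its average and sum by module, where the module is the key with its last dotted component removed. The sketch below does that arithmetic with the figures quoted above. How `profile-request.py` actually computes its totals is not shown here, so the grouping rule and the function name `totals_by_module` are assumptions.

```python
from collections import defaultdict

# (key, count, avg seconds) copied from the per-call table above.
ROWS = [
    ("nova.api.openstack.wsgi.POST", 1, 0.657),
    ("nova.db.api.instance_update", 1, 0.191),
    ("nova.image.show", 1, 0.179),
    ("nova.db.api.instance_add_security_group", 1, 0.082),
    ("nova.rpc.cast", 1, 0.059),
    ("nova.db.api.instance_get_all_by_filters", 1, 0.034),
    ("nova.db.api.security_group_get_by_name", 2, 0.029),
    ("nova.db.api.instance_create", 1, 0.011),
    ("nova.db.api.quota_get_all_by_project", 3, 0.003),
    ("nova.db.api.instance_data_get_for_project", 1, 0.003),
]


def totals_by_module(rows):
    """Sum call counts and total time (count * avg) per module."""
    totals = defaultdict(lambda: [0, 0.0])
    for key, count, avg in rows:
        # Assumed grouping rule: drop the last dotted component.
        module = key.rsplit(".", 1)[0]
        totals[module][0] += count
        totals[module][1] += count * avg
    return {m: (c, round(t, 3)) for m, (c, t) in totals.items()}


if __name__ == "__main__":
    for module, (count, total) in totals_by_module(ROWS).items():
        print(f"{module:<28}{count:>4}  {total:.3f}")
```

For nova.db.api this gives 10 calls and 0.388 seconds, matching the second table: database calls account for more than half of the 0.657-second request.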
>
> The conclusion I reached from this profiling is that we just plain
> overuse the db (and we might do the same in glance). For example,
> whenever we do updates, we actually re-retrieve the item from the
> database, update its dictionary, and save it. This is double the
> cost it needs to be. We also handle updates for data across tables
> inefficiently, where they could be handled in a single database round
> trip.
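The update pattern described above (fetch the row, change a dictionary, save it back) can be contrasted with a single `UPDATE` statement. The sketch below uses the standard `sqlite3` module with a hypothetical one-table schema; the function names and the column whitelist are assumptions for illustration and do not reflect Nova's actual DB layer.

```python
import sqlite3

# Hypothetical columns that callers are allowed to update.
ALLOWED_COLUMNS = ("vm_state", "task_state")


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances "
        "(id INTEGER PRIMARY KEY, vm_state TEXT, task_state TEXT)")
    conn.execute("INSERT INTO instances VALUES (1, 'building', 'spawning')")
    return conn


def update_read_modify_write(conn, instance_id, values):
    """Two statements: SELECT the row, then write every column back."""
    row = conn.execute(
        "SELECT id, vm_state, task_state FROM instances WHERE id = ?",
        (instance_id,)).fetchone()
    record = dict(zip(("id", "vm_state", "task_state"), row))
    record.update(values)
    conn.execute(
        "UPDATE instances SET vm_state = ?, task_state = ? WHERE id = ?",
        (record["vm_state"], record["task_state"], instance_id))
    return 2  # statements issued


def update_in_place(conn, instance_id, values):
    """One statement: UPDATE only the columns that changed."""
    unknown = [c for c in values if c not in ALLOWED_COLUMNS]
    if unknown or not values:
        raise ValueError("unsupported or empty update: %r" % (values,))
    # Column names come from the whitelist; values are bound parameters.
    assignments = ", ".join("%s = ?" % c for c in values)
    params = list(values.values()) + [instance_id]
    conn.execute(
        "UPDATE instances SET %s WHERE id = ?" % assignments, params)
    return 1  # statements issued


if __name__ == "__main__":
    conn = make_db()
    update_in_place(conn, 1, {"vm_state": "active"})
    print(conn.execute("SELECT * FROM instances").fetchall())
```

Both functions leave the row in the same state, but the second needs half as many round trips and avoids rewriting columns that did not change.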
>
> In particular, in the case of server listings, extensions are just
> rough on performance. Most extensions hit the database again
> at least once. This isn't really so bad, but it clearly is an area
> where we should improve, since these are the most frequent api
> queries.
>
> I just see a ton of specific performance problems that are easier
> to address one by one, rather than diving into a general (albeit
> obvious) solution such as caching.
>
>
> "Sandy Walsh" <sandy.walsh@xxxxxxxxxxxxx> said:
>
> > We're doing tests to find out where the bottlenecks are, caching is the
> > most obvious solution, but there may be others. Tools like memcache do a
> > really good job of sharing memory across servers so we don't have to
> > reinvent the wheel or hit the db at all.
> >
> > In addition to looking into caching technologies/approaches we're gluing
> > together some tools for finding those bottlenecks. Our first step will
> > be finding them, then squashing them ... however.
> >
> > -S
> >
> > On 03/22/2012 06:25 PM, Mark Washenberger wrote:
> >> What problems are caching strategies supposed to solve?
> >>
> >> On the nova compute side, it seems like streamlining db access and
> >> api-view tables would solve any performance problems caching would
> >> address, while keeping the stale data management problem small.
> >>
> >
> > _______________________________________________
> > Mailing list: https://launchpad.net/~openstack
> > Post to : openstack@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~openstack
> > More help : https://help.launchpad.net/ListHelp
> >