Re: Caching strategies in Nova ...
Alas, I let my patch get too stale to rebase properly. However, the
approach I took is fairly "dumb" and can be demonstrated just from the
patch itself. In any case, I think the approach you're taking, profiling
based on Tach, is going to be better in the long run and more shareable
within the community.
+ 1 gazillion to getting good metrics!
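For anyone who doesn't want to read the whole diff: the trick in the
patch is just a proxy object that writes one log line on the way into
every call and another on the way out; the request ids and timestamps
already present in the log output then give you wall times for free.
A stripped-down sketch of that shape (illustrative only, not the
literal patch code, which is appended below):

    import functools
    import logging

    LOG = logging.getLogger(__name__)


    class LogEverything(object):
        """Proxy that logs entry and exit of every callable attribute
        of the wrapped object; timings come from the timestamps the
        logging formatter already adds."""

        def __init__(self, wrapped):
            self.wrapped = wrapped

        def __getattr__(self, name):
            original = getattr(self.wrapped, name)
            if not callable(original):
                return original

            @functools.wraps(original)
            def logged(*args, **kwargs):
                LOG.info("start %s", name)
                try:
                    return original(*args, **kwargs)
                finally:
                    LOG.info("finish %s", name)
            return logged

In the patch this wraps the module-level IMPL objects in nova.db and
nova.rpc and the image service instances returned by nova.image, so no
call sites have to change.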
"Sandy Walsh" <sandy.walsh@xxxxxxxxxxxxx> said:
> (resent to list as I realized I just did a Reply)
>
> Cool! This is great stuff. Look forward to seeing the branch.
>
> I started working on a similar tool that takes the data collected by
> Tach and fetches it from Graphite to look at the performance
> issues (no changes to nova trunk required, since Tach is awesome).
>
> It's only a shell of an idea so far, but the basics work:
> https://github.com/ohthree/novaprof
>
> But if there is something already existing, I'm happy to kill it off.
>
> I don't doubt for a second the db is the culprit for many of our woes.
>
> The thing I like about internal caching using established tools is that
> it works for db issues too without having to resort to custom tables.
> SQL query optimization, I'm sure, will go equally far.
>
> Thanks again for the great feedback ... keep it comin'!
>
> -S
>
>
> On 03/22/2012 11:53 PM, Mark Washenberger wrote:
>> Working on this independently, I created a branch with some simple
>> performance logging around the nova-api, and individually around
>> glance, nova.db, and nova.rpc calls. (Sorry, I only have a local
>> copy and it's on a different computer right now, and it probably needs
>> a rebase. I will rebase and publish it on GitHub tomorrow.)
>>
>> With this logging, I could get some simple profiling that I found
>> very useful. Here is a GH project with the analysis code as well
>> as some nova-api logs I was using as input.
>>
>> https://github.com/markwash/nova-perflog
>>
>> With these tools, you can get a wall-time profile for individual
>> requests. For example, looking at one server create request (and
>> you can run this directly from the checkout as the logs are saved
>> there):
>>
>> markw@poledra:perflogs$ cat nova-api.vanilla.1.5.10.log | python
>> profile-request.py req-3cc0fe84-e736-4441-a8d6-ef605558f37f
>> key count avg
>> nova.api.openstack.wsgi.POST 1 0.657
>> nova.db.api.instance_update 1 0.191
>> nova.image.show 1 0.179
>> nova.db.api.instance_add_security_group 1 0.082
>> nova.rpc.cast 1 0.059
>> nova.db.api.instance_get_all_by_filters 1 0.034
>> nova.db.api.security_group_get_by_name 2 0.029
>> nova.db.api.instance_create 1 0.011
>> nova.db.api.quota_get_all_by_project 3 0.003
>> nova.db.api.instance_data_get_for_project 1 0.003
>>
>> key count total
>> nova.api.openstack.wsgi 1 0.657
>> nova.db.api 10 0.388
>> nova.image 1 0.179
>> nova.rpc 1 0.059
>>
>> All times are in seconds. The nova.rpc time is probably high
>> since this was the first call since server restart, so the
>> connection handshake is probably included. This is also probably
>> 1.5 months stale.
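The numbers above come straight out of the log timestamps: each wrapped
call writes an "MJW <name>" line on entry and an "MJW finish <name>"
line on exit, and per request id the analysis only has to subtract one
from the other. A toy version of that pairing, assuming a default
timestamped/context log line shape and in no way the actual
profile-request.py:

    import re
    import sys
    from collections import defaultdict
    from datetime import datetime

    # Assumed line shape (a guess, sketch only):
    #   2012-03-22 23:53:01.123 INFO nova.db.api [req-... user proj] MJW instance_update
    #   2012-03-22 23:53:01.314 INFO nova.db.api [req-... user proj] MJW finish instance_update
    LINE = re.compile(r'^(\S+ \S+) .*?\[(req-\S+) .*?MJW (finish )?(\S+)')

    def ts(value):
        return datetime.strptime(value, '%Y-%m-%d %H:%M:%S.%f')

    wanted = sys.argv[1]                   # request id to profile
    starts, totals, counts = {}, defaultdict(float), defaultdict(int)
    for line in sys.stdin:
        m = LINE.match(line)
        if not m or m.group(2) != wanted:
            continue
        key = m.group(4)
        if not m.group(3):                 # "MJW <name>": remember the start time
            starts[key] = ts(m.group(1))
        elif key in starts:                # "MJW finish <name>": accumulate the delta
            totals[key] += (ts(m.group(1)) - starts.pop(key)).total_seconds()
            counts[key] += 1

    for key in sorted(totals, key=totals.get, reverse=True):
        print("%-45s %5d %8.3f" % (key, counts[key], totals[key]))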
>>
>> The conclusion I reached from this profiling is that we just plain
>> overuse the db (and we might do the same in glance). For example,
>> whenever we do updates, we actually re-retrieve the item from the
>> database, update its dictionary, and save it. This is double the
>> cost it needs to be. We also handle updates for data across tables
>> inefficiently, where they could be handled in a single database round
>> trip.
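To make that concrete, the update path is a read-modify-write where a
plain UPDATE would do. Roughly (a hypothetical SQLAlchemy-style sketch,
not the actual nova.db implementation; the model name is made up):

    # What the update path effectively does today (illustrative only):
    def instance_update_slow(session, instance_id, values):
        instance = session.query(Instance).get(instance_id)   # round trip 1: re-read the row
        for key, value in values.items():
            setattr(instance, key, value)
        session.add(instance)
        session.flush()                                        # round trip 2: write it back

    # A single-round-trip alternative:
    def instance_update_fast(session, instance_id, values):
        session.query(Instance).filter_by(id=instance_id).update(values)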
>>
>> In particular, in the case of server listings, extensions are just
>> rough on performance. Most extensions hit the database again
>> at least once. This isn't really so bad, but it clearly is an area
>> where we should improve, since these are the most frequent api
>> queries.
>>
>> I just see a ton of specific performance problems that are easier
>> to address one by one, rather than diving into a general (albeit
>> obvious) solution such as caching.
>>
>>
>> "Sandy Walsh" <sandy.walsh@xxxxxxxxxxxxx> said:
>>
>>> We're doing tests to find out where the bottlenecks are; caching is the
>>> most obvious solution, but there may be others. Tools like memcache do a
>>> really good job of sharing memory across servers so we don't have to
>>> reinvent the wheel or hit the db at all.
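The memcache pattern being described here is essentially cache-aside:
check memcached first, fall back to the db on a miss, then populate the
cache. A minimal sketch with python-memcached (the names and the db
lookup are stand-ins, purely illustrative):

    import memcache

    MC = memcache.Client(['127.0.0.1:11211'])

    def instance_get_from_db(context, instance_id):
        # Stand-in for the real db lookup.
        raise NotImplementedError

    def instance_get_cached(context, instance_id, ttl=60):
        """Cache-aside: memcached first, db only on a miss."""
        key = 'instance-%s' % instance_id
        instance = MC.get(key)
        if instance is None:
            instance = instance_get_from_db(context, instance_id)
            MC.set(key, instance, time=ttl)
        return instance

The obvious cost is the window of stale data between db writes and
cache expiry, which is exactly the trade-off under discussion here.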
>>>
>>> In addition to looking into caching technologies/approaches, we're gluing
>>> together some tools for finding those bottlenecks. Our first step will
>>> be finding them, then squashing them ... however.
>>>
>>> -S
>>>
>>> On 03/22/2012 06:25 PM, Mark Washenberger wrote:
>>>> What problems are caching strategies supposed to solve?
>>>>
>>>> On the nova compute side, it seems like streamlining db access and
>>>> api-view tables would solve any performance problems caching would
>>>> address, while keeping the stale data management problem small.
>>>>
>>>
>
diff --git nova/api/openstack/wsgi.py nova/api/openstack/wsgi.py
index af62f9b..70f4474 100644
--- nova/api/openstack/wsgi.py
+++ nova/api/openstack/wsgi.py
@@ -769,6 +769,9 @@ class Resource(wsgi.Application):
LOG.info("%(method)s %(url)s" % {"method": request.method,
"url": request.url})
+ LOG.info("MJW %(method)s %(url)s" % {"method": request.method,
+ "url": request.url},
+ context=request.environ['nova.context'])
# Identify the action, its arguments, and the requested
# content type
@@ -782,8 +785,12 @@ class Resource(wsgi.Application):
# function. If we try to audit __call__(), we can
# run into troubles due to the @webob.dec.wsgify()
# decorator.
- return self._process_stack(request, action, action_args,
- content_type, body, accept)
+ ret = self._process_stack(request, action, action_args,
+ content_type, body, accept)
+ LOG.info("MJW finish %(method)s %(url)s" % {"method": request.method,
+ "url": request.url},
+ context=request.environ['nova.context'])
+ return ret
def _process_stack(self, request, action, action_args,
content_type, body, accept):
diff --git nova/db/api.py nova/db/api.py
index a886945..63bdf69 100644
--- nova/db/api.py
+++ nova/db/api.py
@@ -42,11 +42,13 @@ these objects be simple dictionaries.
pool of available hardware (Default: True)
"""
+import functools
from nova import exception
from nova import flags
from nova.openstack.common import cfg
from nova import utils
+from nova import log as logging
db_opts = [
@@ -73,8 +75,34 @@ db_opts = [
FLAGS = flags.FLAGS
FLAGS.add_options(db_opts)
-IMPL = utils.LazyPluggable(FLAGS['db_backend'],
- sqlalchemy='nova.db.sqlalchemy.api')
+
+LOG = logging.getLogger('nova.db.api')
+
+
+class LogEverything(object):
+
+ def __init__(self, wrapped):
+ self.wrapped = wrapped
+
+ def __getattr__(self, name):
+ original = getattr(self.wrapped, name)
+ if callable(original):
+ return self._log_everything(original, name)
+ else:
+ return original
+
+ def _log_everything(self, fun, name):
+ @functools.wraps(fun)
+ def logged_fun(context, *args, **kwargs):
+ LOG.info("MJW %s" % name, context=context)
+ ret = fun(context, *args, **kwargs)
+ LOG.info("MJW finish %s" % name, context=context)
+ return ret
+ return logged_fun
+
+
+IMPL = LogEverything(utils.LazyPluggable(FLAGS['db_backend'],
+ sqlalchemy='nova.db.sqlalchemy.api'))
class NoMoreNetworks(exception.Error):
diff --git nova/image/__init__.py nova/image/__init__.py
index c5ca6f3..ac047fc 100644
--- nova/image/__init__.py
+++ nova/image/__init__.py
@@ -14,19 +14,47 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
-
+import functools
import nova
from nova import utils
from nova import flags
from nova.image import glance
+from nova import log as logging
FLAGS = flags.FLAGS
+LOG = logging.getLogger('nova.image')
+
+
+class LogEverything(object):
+
+ def __init__(self, *args, **kwargs):
+ if not hasattr(self, 'already_inited'):
+ self.wrapped = args[0]
+ self.already_inited = True
+ else:
+ self.wrapped.__init__(*args, **kwargs)
+
+ def __getattr__(self, name):
+ original = getattr(self.wrapped, name)
+ if callable(original):
+ return self._log_everything(original, name)
+ else:
+ return original
+
+ def _log_everything(self, fun, name):
+ @functools.wraps(fun)
+ def logged_fun(context, *args, **kwargs):
+ LOG.info("MJW %s" % name, context=context)
+ ret = fun(context, *args, **kwargs)
+ LOG.info("MJW finish %s" % name, context=context)
+ return ret
+ return logged_fun
def get_default_image_service():
ImageService = utils.import_class(FLAGS.image_service)
- return ImageService()
+ return LogEverything(ImageService())
def get_image_service(context, image_href):
@@ -48,4 +76,4 @@ def get_image_service(context, image_href):
(glance_client, image_id) = glance.get_glance_client(context,
image_href)
image_service = nova.image.glance.GlanceImageService(glance_client)
- return (image_service, image_id)
+ return (LogEverything(image_service), image_id)
diff --git nova/log.py nova/log.py
index c0bc256..00a203f 100644
--- nova/log.py
+++ nova/log.py
@@ -30,6 +30,7 @@ It also allows setting of formatting information through flags.
"""
import cStringIO
+import eventlet
import inspect
import json
import logging
@@ -49,6 +50,7 @@ from nova import version
log_opts = [
cfg.StrOpt('logging_context_format_string',
default='%(asctime)s %(levelname)s %(name)s [%(request_id)s '
+ '%(eventlet_id)x '
'%(user_id)s %(project_id)s] %(instance)s'
'%(message)s',
help='format string to use for log messages with context'),
@@ -177,6 +179,8 @@ class NovaLogger(logging.Logger):
params['extra'] = {}
extra = params['extra']
+ extra['eventlet_id'] = id(eventlet.getcurrent())
+
context = None
if 'context' in params:
context = params['context']
diff --git nova/rpc/__init__.py nova/rpc/__init__.py
index 7c6ed29..80a7a7c 100644
--- nova/rpc/__init__.py
+++ nova/rpc/__init__.py
@@ -16,11 +16,13 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
+import functools
from nova import flags
from nova.openstack.common import cfg
from nova.rpc.common import RemoteError, LOG
from nova.utils import import_object
+from nova import log as logging
rpc_backend_opt = \
@@ -30,6 +32,29 @@ rpc_backend_opt = \
FLAGS = flags.FLAGS
FLAGS.add_option(rpc_backend_opt)
+LOG = logging.getLogger('nova.rpc')
+
+
+class LogEverything(object):
+
+ def __init__(self, wrapped):
+ self.wrapped = wrapped
+
+ def __getattr__(self, name):
+ original = getattr(self.wrapped, name)
+ if callable(original):
+ return self._log_everything(original, name)
+ else:
+ return original
+
+ def _log_everything(self, fun, name):
+ @functools.wraps(fun)
+ def logged_fun(context, *args, **kwargs):
+ LOG.info("MJW %s" % name, context=context)
+ ret = fun(context, *args, **kwargs)
+ LOG.info("MJW finish %s" % name, context=context)
+ return ret
+ return logged_fun
def create_connection(new=True):
@@ -169,5 +194,5 @@ def _get_impl():
"""Delay import of rpc_backend until FLAGS are loaded."""
global _RPCIMPL
if _RPCIMPL is None:
- _RPCIMPL = import_object(FLAGS.rpc_backend)
+ _RPCIMPL = LogEverything(import_object(FLAGS.rpc_backend))
return _RPCIMPL