Re: Performance diagnosis of metadata query


Well, there are diablo-stable packages. If Ubuntu, Debian, Red Hat, etc. keep hearing from customers that Essex in an LTS release is not adequate, there will be essex-stable packages too. They are the ones who have to stand behind the product. It is perfectly understandable that there is resistance to putting in anything other than fixes for critical bugs a week or so from release. I am not saying this is great, but if release dates are fixed and features and performance are the things allowed to vary, then what else is there to do? Just my opinion.

 -David

On 3/29/2012 1:55 PM, Justin Santa Barbara wrote:
I'm not saying it can't be rationalized; I'm saying it is frustrating to me.

My understanding is that Essex is going to be baked into both Ubuntu & Debian for the long term - 5 years plus. That's a long time to have to keep explaining why X is broken; I'd rather just fix X.



On Thu, Mar 29, 2012 at 10:22 AM, David Kranz <david.kranz@xxxxxxxxxx> wrote:
    On 3/29/2012 12:46 PM, Justin Santa Barbara wrote:

        Is there a good way to map back where in the code these calls
        are coming from?


    There's not a great way currently.  I'm trying to get a patch in
    for Essex which will let deployments easily turn on SQL debugging
    (though this is proving contentious); it will have a configurable
    log level to allow for future improvements, and one of the things
    I'd like to add later is something like a stack trace on
    'problematic' SQL (large row count, long query time).  But
    that'll be in Folsom, or in G if we don't get logging into Essex.
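
    Roughly, the kind of hook I have in mind would look something like
    this (just a sketch using SQLAlchemy's event API; the thresholds and
    logger name are made up, not the actual patch):

        # Sketch only: log a stack trace for "problematic" SQL
        # (long-running or large-row-count queries).
        import logging
        import time
        import traceback

        from sqlalchemy import event
        from sqlalchemy.engine import Engine

        LOG = logging.getLogger("sql_debug")
        SLOW_QUERY_SECONDS = 0.5   # hypothetical threshold
        LARGE_ROW_COUNT = 1000     # hypothetical threshold

        @event.listens_for(Engine, "before_cursor_execute")
        def _start_timer(conn, cursor, statement, parameters, context,
                         executemany):
            # Remember when this statement started executing.
            context._query_start_time = time.time()

        @event.listens_for(Engine, "after_cursor_execute")
        def _flag_problem_queries(conn, cursor, statement, parameters,
                                  context, executemany):
            elapsed = time.time() - getattr(context, "_query_start_time",
                                            time.time())
            rows = cursor.rowcount
            if elapsed > SLOW_QUERY_SECONDS or rows > LARGE_ROW_COUNT:
                # Include a stack trace so the offending caller is obvious.
                LOG.warning("problematic SQL (%.3fs, %s rows): %s\n%s",
                            elapsed, rows, statement,
                            "".join(traceback.format_stack()))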

    In the meantime, it's probably not too hard to follow the code
    and infer where the calls are coming from.  In the full log,
    there's a bit more context, and I've probably snipped some of
    that out; in this case the relevant code is get_metadata in the
    compute API service and get_instance_nw_info in the network service.

         Regardless, large table scans should be eliminated,
        especially if the table is mostly read, as the hit from
        maintaining an extra index on insert will be completely offset
        by the speedups on select.


    Agreed - some of these problems are very clear-cut!
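
    For example (a sketch only; the table, column, and index names here
    are hypothetical, not the actual schema or migration), turning one of
    these scans into an index lookup is a small migration:

        # Sketch of a sqlalchemy-migrate style migration adding an index;
        # the table/column names are placeholders for illustration.
        from sqlalchemy import Index, MetaData, Table

        def upgrade(migrate_engine):
            meta = MetaData(bind=migrate_engine)
            instances = Table('instances', meta, autoload=True)
            # Index the column the metadata query filters on, so the
            # mostly-read table is no longer scanned on every request.
            Index('instances_uuid_idx',
                  instances.c.uuid).create(migrate_engine)

        def downgrade(migrate_engine):
            meta = MetaData(bind=migrate_engine)
            instances = Table('instances', meta, autoload=True)
            Index('instances_uuid_idx',
                  instances.c.uuid).drop(migrate_engine)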

    It does frustrate me that we've done so much programming work,
    but then don't do the simple stuff at the end to make things work
    well.  It feels a bit like we're shipping C code which we've
    compiled with -O0 instead of -O3.


    Well, in a project with the fixed-date, short-cycle train-model
    release style that openstack has, I think we have to accept that
    there will never be time to do anything except fight critical
    bugs "at the end", at least not until the project code is much
    more mature. In projects I have managed, we always allocated time
    at the *beginning* of a release cycle for fixing backlogged bugs
    and for performance work: there is less pressure and the code is
    not yet churning. It is also important to have performance
    benchmark tests to make sure new features do not introduce
    performance regressions.
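
    Such a benchmark can start out as simple as a timed assertion in
    the test suite (again just a sketch; the stubbed call and the time
    budget below are placeholders, not real project code):

        # Sketch of a performance regression guard.
        import time
        import unittest

        def fetch_metadata():
            # Placeholder for the real call under test, e.g. the compute
            # API's get_metadata(); stubbed so the sketch runs standalone.
            time.sleep(0.01)

        class MetadataPerformanceTest(unittest.TestCase):
            BUDGET_SECONDS = 0.2  # hypothetical budget; fail if we regress

            def test_metadata_query_stays_fast(self):
                start = time.time()
                fetch_metadata()
                elapsed = time.time() - start
                self.assertLess(elapsed, self.BUDGET_SECONDS,
                                "metadata query took %.3fs" % elapsed)

        if __name__ == '__main__':
            unittest.main()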

     -David




