openstack team mailing list archive

Re: [Ops] OpenStack and Operations: Input from the Wild

 

Thanks.  Now I understand the performance metrics you guys were talking
about.  It would be good to have a tool that reports numbers for a cloud,
just as 'mpstat' and 'iostat' do for a single system.
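
To make the 'iostat for a cloud' idea concrete, here is a minimal Python
sketch. It assumes each service exposes cumulative counters that can be
sampled periodically; the names Sample, rates, format_report and the counter
names rest_calls and vms_created are hypothetical and are not part of any
OpenStack API. Like iostat, it reports per-second rates computed from the
difference between two samples.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of cumulative counters (hypothetical structure)."""
    timestamp: float   # seconds, e.g. from time.time()
    counters: dict     # cumulative counts, e.g. {"rest_calls": 1200}


def rates(prev: Sample, curr: Sample) -> dict:
    """Return the per-second rate of every counter present in both samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return {
        name: (curr.counters[name] - prev.counters[name]) / elapsed
        for name in curr.counters
        if name in prev.counters
    }


def format_report(per_second: dict) -> str:
    """Render the rates as a small fixed-width table, one metric per line."""
    lines = [f"{'metric':<16}{'per_sec':>10}"]
    for name in sorted(per_second):
        lines.append(f"{name:<16}{per_second[name]:>10.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Two made-up samples taken 10 seconds apart.
    before = Sample(0.0, {"rest_calls": 1200, "vms_created": 30})
    after = Sample(10.0, {"rest_calls": 1700, "vms_created": 35})
    print(format_report(rates(before, after)))
```

Working from cumulative counters rather than instantaneous values means the
reporting tool can choose its own sampling interval without the services
having to know about it.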

On Mon, Apr 9, 2012 at 3:06 PM, Tim Bell <Tim.Bell@xxxxxxx> wrote:

> Availability metrics for me are ones that allow me to tell if the service
> is up, degraded or down. As each of us starts production monitoring, we
> need to work out how many nova, glance and swift processes of which type
> should be running.  Furthermore, we need to add basic 'ping' style probes
> to see that the services are responding as expected.
>
> Performance metrics are for cases where we want to record how well the
> system is running. Examples are the number of REST calls/second, VMs
> created/second, etc.  These are the kind of metrics which feed into
> capacity planning, bottleneck identification and trending.
>
> Building up an open, standard and consistent set will avoid duplicate
> effort as sites deploy to production and allow us to keep the monitoring up
> to date when the internals of OpenStack change.
>
> Tim
>
> *From:* Huang Zhiteng [mailto:winston.d@xxxxxxxxx]
> *Sent:* 09 April 2012 05:42
> *To:* Tim Bell
> *Cc:* David Kranz; Andrew Clay Shafer;
> openstack-operators@xxxxxxxxxxxxxxxxxxx; Duncan McGreggor; openstack
>
> *Subject:* Re: [Openstack] [Ops] OpenStack and Operations: Input from the
> Wild
>
> Hi Tim,
>
> Could you elaborate more on 'performance metrics'?  What kinds of metrics
> are considered performance metrics?  Thanks.
>
> On Sat, Apr 7, 2012 at 2:13 AM, Tim Bell <Tim.Bell@xxxxxxx> wrote:
>
> Splitting monitoring into:
>
> 1. Gathering of metrics (availability, performance) and reporting in a
>    standard fashion should be part of OpenStack.
>
> 2. Best practice sensors should sample the metrics and provide alarms for
>    issues which could cause service impacts. Posting of these alarms to a
>    monitoring system should be based on plug-ins.
>
> 3. Reference implementations for standard monitoring systems such as
>    Nagios should be available that query the data above and feed it into
>    the package selected.
>
> Each site does not want to be involved in defining the best practice.
> Equally, each monitoring system should not have to have an intimate
> understanding of OpenStack to produce a red/green light.  Items 1 and 2
> fall under the associated OpenStack component. Item 3 falls to the
> monitoring solution provider.
>
> Tim
>
> *From:* openstack-bounces+tim.bell=cern.ch@xxxxxxxxxxxxxxxxxxx [mailto:
> openstack-bounces+tim.bell=cern.ch@xxxxxxxxxxxxxxxxxxx] *On Behalf Of *David
> Kranz
> *Sent:* 06 April 2012 16:44
> *To:* Andrew Clay Shafer
> *Cc:* openstack-operators@xxxxxxxxxxxxxxxxxxx; openstack; Duncan McGreggor
> *Subject:* Re: [Openstack] [Ops] OpenStack and Operations: Input from the
> Wild
>
> This is a really great list! With regard to cluster health and monitoring,
> I did a bunch of work with Swift before turning to nova and really
> appreciated the way each Swift service has a "healthcheck" call that can
> be used by a monitoring system. While I don't think providing a
> production-ready monitoring system should be part of core OpenStack, it is
> the core architects who really know what needs to be checked to ensure
> that a system is healthy. Crowbar, Zenoss, etc. set up various checks that
> poke at ports, process lists and so on, but it would be a big improvement
> for deployers if each OpenStack service provided healthcheck APIs based on
> expert knowledge of what is supposed to be happening inside. That would
> also insulate deployers from changes in the code that might impact what it
> means to be running properly. Looking forward to the discussion.
>
>  -David
>
>
>
> On 4/6/2012 1:06 AM, Andrew Clay Shafer wrote:
>
> Interested in devops.
>
> Off the top of my head:
>
> live upgrades
>
> API-queryable indications of cluster health
>
> API-queryable cluster version and configuration info
>
> enabling monitoring as a first-class concern in OpenStack (either as a
> cross-cutting concern, or as its own project)
>
> a framework for gathering and sharing performance benchmarks with
> architecture and configuration
>
> On Thu, Apr 5, 2012 at 1:52 PM, Duncan McGreggor <duncan@xxxxxxxxxxxxx>
> wrote:
>
> For anyone interested in DevOps, Ops, cloud hosting management, etc.,
> there's a proposed session we could use your feedback on for topics of
> discussion:
>  http://summit.openstack.org/sessions/view/57
>
> Respond with your thoughts and ideas, and I'll be sure to add them to the
> list.
>
> Thanks!
>
> d
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
> --
> Regards
> Huang Zhiteng
>



-- 
Regards
Huang Zhiteng