
openstack team mailing list archive

Re: [CHEF] How to structure upstream OpenStack cookbooks?


On Sat, Mar 10, 2012 at 5:49 PM, Andrew Clay Shafer
<acs@xxxxxxxxxxxxxxxx> wrote:
> some response inline followed by general comments on the topic
> On Sat, Mar 10, 2012 at 9:30 AM, andi abes <andi.abes@xxxxxxxxx> wrote:
>> I like where this discussion is going. So
>> I'd like to throw a couple more sticks into the fire, around test/SAIO
>> vs production deployments..
> I think there are some misconceptions that lead to problems.
> If your SAIO/test diverge from production deployments, what are you really
> testing?

there's a big difference, imho, between testing code integration and
production deployments.
e.g. if you're trying to write a functional test for swift, that
requires the proxy server to connect to the account, container and
object servers - it'd be pretty handy to have a SAIO deployment (e.g.
to test the new brimring).

Are you trying to say that all deployments should always be the same,
lest they be meaningless?

> If you really understand what configuration management tools are for, the
> recipes are code that is going through the same cycle.
>> * Swift cookbooks (and in general) should not assume control of system
>> side resources, but rather use the appropriate cookbook (or better
>> yet "definition" if it exists). e.g. rsync might be used for a variety
>> of other purposes - by other roles deployed to the same node. The
>> rsync (not currently, but hopefully soon) cookbook should provide the
>> appropriate hooks to add your role's extras. Maybe a better example is
>> the sudoers cookbook, which allows node attributes to describe users &
>> groups.
> I agree in principle.
> This is good practice in general, but in specific can require a lot of
> understanding and discipline to separate the recipes.
>> * SAIO deployments could probably be kept really simple if they don't
>> have to deal with repeated application - no need to worry about
>> idempotency which tends to make things much harder. A greenfield
>> deployment + some scripts to "operate" the test install are probably
>> just the right thing.
> If you find yourself writing chef with the idea that you don't need to worry
> about idempotency, you aren't doing configuration management, you are
> writing an installer. You are almost better off using bash. Part of my
> position here also stems from my position that these recipes shouldn't be
> different.
> You might think I'm being a zealot, and relatively speaking, you might be
> right, but from my experience the extra work to do this upfront pays itself
> back many times over. Also, I can show you where the real zealots live. :)

I actually agree with you. My comments were specific to the proposed
swift- cookbook. It does mostly what the online instructions provide (and
uses much of the bash shell code verbatim), with chef/ruby code added
to attempt to handle idempotency.

My point was that for SAIO, you don't really need config
management.... and an "installer" as you call it is probably
Sorry if I wasn't clear.
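
To make the "installer" vs. config management distinction concrete,
here is a minimal plain-Ruby sketch (not Chef code; both helper names
are made up for illustration). The first helper behaves like an
installer step and changes state on every run; the second converges on
a desired state, which is what idempotency means here.

```ruby
# Installer style: appends the line on every run, so a second
# application leaves a duplicate behind.
def append_line(lines, line)
  lines + [line]
end

# Configuration-management style: describe the desired state ("this line
# is present") and change something only when that state is not met.
def ensure_line(lines, line)
  lines.include?(line) ? lines : lines + [line]
end
```

Applied once to a fresh system, the two give the same result, which is
why the installer style can be enough for a throwaway SAIO box. They
only diverge on the second and later runs.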

>> * Configurability - in testing, you'd like things pretty consistent.
>> One pattern I've been using is having attribute values that are
>> 'eval'ed to retrieve the actual data.
>> For example - the IP address/interface to use for storage
>> communication (i.e. proxy <-> account server), a node attribute called
>> "storage_interface" is evaluated. A user (or higher level system) can
>> assign either "node[:ipaddress]" (which is controlled by chef, and
>> goes slightly bonkers when multiple interfaces are present) or be more
>> opinionated and use e.g.
>> "node[:crowbar][:interfaces][:storage_network]"
> This gets to the heart of one of the reasons why there is so much variation
> in the wild.
> chef allows for almost infinite flexibility for how information gets into
> the system and how it gets used.
> Further, there is a tension between specificity and discovery.
> By that I mean one can either specify a specific value, in cookbooks, roles
> or databags, or you can discover values from the running systems.
> This is a spectrum and there is not always a 'right' answer. Both have value
> and both can become problematic. Context, opinions and philosophy of the
> author are typically what determine the details of a given cookbook.
I'll get to your example in a bit (next comment below). But the
example I used was trying to get at a different issue.
In both cases I described, the information is "discovered" in your
terminology. Chef reports the IP address (or crowbar - which just
moves it per interface).
The issue is - when a given piece of information can be retrieved from
multiple (apparently equivalent) places - how do you choose which one
to use? And what does it take to modify the choice?
The eval approach allows modifying an attribute (rather than a recipe)
to change the source of the discovered data (the IP address to use).
This could help reduce the changes a potential user of a cookbook
might need to make.
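
A minimal sketch of the eval approach, in plain Ruby with a nested Hash
standing in for chef's node object. The IP addresses, the
node[:swift][:storage_interface] nesting and the helper names are
assumptions for illustration, not the actual cookbook code.

```ruby
# Stand-in for chef's node object: a nested Hash with symbol keys.
# All values here are made up for illustration.
EXAMPLE_NODE = {
  ipaddress: "10.0.0.5",
  crowbar: { interfaces: { storage_network: "192.168.10.5" } },
  # The attribute holds an expression, not the address itself.
  swift: { storage_interface: "node[:ipaddress]" }
}.freeze

# Evaluate the expression stored in an attribute. `binding` exposes the
# local variable `node`, so the string can refer to node[...] directly.
# eval runs arbitrary Ruby, so the attribute value has to be trusted.
def resolve_attribute(node, expression)
  eval(expression, binding)
end

# Recipe-side lookup: the recipe never names the source of the address.
def storage_address(node)
  resolve_attribute(node, node[:swift][:storage_interface])
end
```

Switching to the more opinionated source then means changing only the
attribute value to "node[:crowbar][:interfaces][:storage_network]";
storage_address and anything that calls it stay untouched.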

> I'll attempt to make this more concrete. How do you want to deploy and
> manage the filesystems and devices for rings of Swift?
> On one end of the spectrum, parameterize an attribute with all the devices
> that will be in the ring, with the obvious alternative being that some
> procedure and convention gets the devices from the running systems.

funny.. there are cookbooks out there for swift that match your
spectrum to a tee ;)
The Crowbar cookbooks discover the disks (with a bit of filtering)
The Voxel ones have a prescribed data bag, which determines the ring's content.

> This is also a tension between knowing and doing, and leads to a bunch of
> other questions.
> In the case where a specified device is missing/failed, what should the
> behavior be? If doing discovery, is there a method to sanity check what the
> devices should be?
> Both cases only get more complicated when considering ongoing management of
> the cluster, adding/removing capacity etc.

The crowbar philosophy re devices in the ring is: add as discovered,
remove only if the node is removed from chef. This allows
operators to decide how to recover from failure conditions.
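
A rough sketch of that rule in plain Ruby. The RingDevice struct, the
reconcile_ring helper and the node/device names are hypothetical; this
is not the actual Crowbar code, just the add/remove policy described
above.

```ruby
# Hypothetical ring entry: the node that owns the device, and the device name.
RingDevice = Struct.new(:node_name, :device)

# current_ring: devices already in the ring
# discovered:   devices reported by nodes during this run
# known_nodes:  names of nodes still registered with chef
def reconcile_ring(current_ring, discovered, known_nodes)
  # Add as discovered: union of what is in the ring and what was just found.
  merged = (current_ring + discovered).uniq
  # Remove only when the owning node is gone from chef. A device that is
  # merely missing from this run's discovery stays in the ring.
  merged.select { |dev| known_nodes.include?(dev.node_name) }
end
```

With this policy a node that is briefly unreachable keeps its devices
in the ring, and dropping them takes an explicit operator action
(removing the node from chef).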

> Operational scenarios start begging the question of what should be managed
> with chef at all. (they also beg the question of whether there should be
> some more automated ring management in Swift itself)

that could be a bit dangerous - a brief failure in e.g. connectivity
to a node holding 30TB of data should probably not trigger an
automatic removal of the node...

> This is just one example, but I hope it illustrates the point.
> My current personal preferences/bias:
> Do not make 'installers' with configuration management tools
> Stand alone recipes should not be separate. If they do exist, it is as part
> of the initial pass to get a working cookbook with the intent to refactor to
> a more generalized cookbook. AIO should then be a role
> lean towards specificity supported by tooling to manage the metadata for
> node specific configuration
> utilize discovery for cross node configuration (of the data that was
> specified for the other nodes)
totally in sync so far.

> I'm not saying I'm right, just that I've seen and tried things a few
> different ways and this seemed to work best in my context.
> So to answer Jay's explicit questions:
> 1) Do resources that set up non-production environments such as Swift
> All-in-One belong in the OpenStack Chef upstream cookbooks?
> I vote no. Not as a cookbook. As a role, maybe.
> 2) Should the cookbook be called "swift" instead of "swift-aio", with the
> idea that the cookbook should be the top-most container of resources
> involved with a specific project?
> Assuming you allow aio to be separate, this is a better organization IMHO.
> If the AIO can't be just a role parameterization, the proximity hopefully
> encourages more modularity and reuse.
> 3) Is it possible to have a "swift" cookbook and have resources underneath
> that allow a user to deploy either SAIO *or* into a multi-node production
> environment? If so, would the best practice be to create recipes for SAIO
> and recipes for each of the individual Swift servers (proxy, object, etc)
> that would be used in a production configuration?
> Possible. Recipes can be composed. There is clearly more than one way to do all this,
> but getting to some combination of modular recipes parameterized by roles is
> going to be better than a series of disparate monolithic cookbooks in the
> long run. (even if it seems like more work now)
> 4) Instead of having an SAIO recipe in a swift cookbook, is it more
> appropriate to make a Chef *role* called swift-aio that would have a run
> list that contained a number of recipes in the swift cookbook for all the
> Swift servers plus rsync, loopback, etc?
> In my opinion, this is more appropriate.
> 0.02 + interest
> Andrew
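
To illustrate answer 4, here is a plain-Ruby sketch that builds a
swift-aio role out of the same recipes the production roles would use
and emits it in chef's JSON role format. The recipe names
(swift::proxy, swift::loopback, etc.) and the helper names are
hypothetical; only the JSON field names follow chef's role format.

```ruby
require "json"

# Hypothetical server recipes inside a single "swift" cookbook.
SWIFT_SERVERS = %w[proxy account container object].freeze

def recipe_for(server)
  "recipe[swift::#{server}]"
end

# Build a role as a Hash using the field names of chef's JSON role format.
def build_role(name, description, run_list)
  {
    "name" => name,
    "description" => description,
    "json_class" => "Chef::Role",
    "chef_type" => "role",
    "default_attributes" => {},
    "override_attributes" => {},
    "run_list" => run_list
  }
end

# Production: one role per server type, each running a single recipe.
def production_roles
  SWIFT_SERVERS.map do |server|
    build_role("swift-#{server}", "Swift #{server} server", [recipe_for(server)])
  end
end

# SAIO: the same recipes on one node, plus the extras named in question 4.
def swift_aio_role
  extras = ["recipe[rsync]", "recipe[swift::loopback]"]
  build_role("swift-aio", "Swift All-in-One (test only)",
             extras + SWIFT_SERVERS.map { |server| recipe_for(server) })
end
```

JSON.pretty_generate(swift_aio_role) gives a document that could be
saved as a role file. The point is that the AIO role adds no recipes of
its own beyond the test extras; it only changes which recipes land on
the same node.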
