Thread Previous • Date Previous • Date Next • Thread Next |
On 29 Nov 2011 - 17:54, Aaron Lee wrote: > For this I think we need to separate the models from the queries. +1 > A query method should be able to populate as many of the models as > needed to return the data collected. I also think separating the > queries from the models themselves will help us make the storage > engine replaceable, and will allow us to extract common behavior into > the models themselves instead of peppered throughout the codebase. Couldn't agree more. d > Aaron > > On Nov 29, 2011, at 10:31 AM, Chris Behrens wrote: > > > e) sounds good as long as we don't remove the ability to joinload up > > front. Sometimes we need to join. Sometimes we can be lazy. With > > 'list instances with details', we need to get all instances from the > > DB and join network information in a single DB query. Doing 1 + > > (n*x) DB queries to support a 'nova list' will not be acceptable > > when it can be done in 1 query today. That means we'll need > > instance['virtual_interfaces'] (or similar) to work in cases where > > we 'join up front'. Most cases when we're pulling instances from > > the DB, we don't need to join anything. It'd be more efficient to > > not join network information for those calls. This is where e) is > > fine. > > > > - Chris > > > > On Nov 29, 2011, at 9:20 AM, Soren Hansen wrote: > > > >> Hi, guys. > >> > >> Gosh, this turned out to be long. Sorry about that. > >> > >> I'm adding some tests for the DB api, and I've stumbled across something > >> that we should probably discuss. > >> > >> First of all, most (if not all) of the various *_create methods we have, > >> return quite amputated objects. Any attempt to access related objects > >> will fail with the much too familiar: > >> > >> DetachedInstanceError: Parent instance <Whatever at 0x4f5c8d0> is not > >> bound to a Session; lazy load operation of attribute 'other_things' > >> cannot proceed > >> > >> Also, with the SQLAlchemy driver, this test would pass: > >> > >> network = db.network_create(ctxt, {}) > >> network = db.network_get(ctxt, network['id']) > >> > >> instance = db.instance_create(ctxt, {}) > >> self.assertEquals(len(network['virtual_interfaces']), 0) > >> db.virtual_interface_create(ctxt, {'network_id': network['id'], > >> 'instance_id': instance['id']} > >> > >> self.assertEquals(len(network['virtual_interfaces']), 0) > >> network = db.network_get(ctxt, network['id']) > >> self.assertEquals(len(network['virtual_interfaces']), 1) > >> > >> I create a network, pull it out again (as per my comment above), verify > >> that it has no virtual_interfaces related to it, create a virtual > >> interface in this network, and check the network's virtual_interfaces > >> key and finds that it still has length 0. Reloading the network now > >> reveals the new virtual interface. > >> > >> SQLAlchemy does support looking these things up on the fly. In fact, > >> AFAIK, this is its default behaviour. We just override it with > >> joinedload options, because we don't use scoped sessions. > >> > >> My fake db driver looks stuff like this up on the fly (so the > >> assertEquals after the virtual_interface_create will fail with that db > >> driver). > >> > >> So my question is this: Should this be > >> > >> a) looked up on the fly, > >> b) looked up on first key access and then cached, > >> c) looked up when the parent object is loaded and then never again, > >> d) or up to the driver author? > >> > >> Or should we do away with this stuff altogether? I.e. no more looking up > >> related objects by way of __getitem__ lookups, and instead only allow > >> lookups through db methods. So, instead of > >> network['virtual_interfaces'], you'd always do > >> db.virtual_interfaces_get_by_network(ctxt, network['id']). Let's call > >> this option e). > >> > >> I'm pretty undecided myself. If we go with option e) it becomes clear to > >> consumers of the DB api when they're pulling out fresh stuff from the DB > >> and when they're reusing potentially old results. Explicit is better > >> than implicit, but it'll take quite a bit of work to change this. > >> > >> If we go with one of options a) through d), my order of preference is > >> (from most to least preferred): a), d), c), b). > >> > >> There's value in having a right way and a wrong way to do this. If it's > >> either-or, it makes testing (as in validation) more awkward. I'd say > >> it's always possible to do on-the-fly lookups. Overriding __getitem__ > >> and fetching fresh results is pretty simple, and, as mentioned earlier, > >> I believe this is SQLAlchemy's default behaviour (somebody please > >> correct me if I'm wrong). Forcing an arbitrary ORM to replicate the > >> behaviour of b) and c) could be incredibly awkward, and c) is also > >> complicated because there might be reference loops involved. Also, > >> reviewing correct use of something where the need for reloads depends on > >> previous use of your db objects (which might itself be conditional or > >> happen in called methods) sounds like no fun at all. With d) it's > >> pretty straightforward: "Do you want to be sure to have fresh responses? > >> Then reload the object. Otherwise, behaviour is undefined." It's > >> straightforward to explain and review. Option e) is also easy to > >> explain and do reviews for, btw. > >> > >> DB objects with options a) through d) will have trouble making it > >> through a serializer. After dict serialisation, the object isn't going > >> to change, unresolved related objects will not be resolved, and > >> prepopulating everything prior to serialisation is out of the question > >> due to the potential for loops and very big sets of related objects (and > >> their related objects and their related objects's related objects, > >> etc.). I think it would be valuable to not have to think a whole lot > >> about whether a particular db-like object is a "real" db object or > >> whether it came in as a JSON object or over the message bus or whatever. > >> Only option e) will give us that, AFAICS. > >> > >> It seems I've talked myself into preferring option e). It's too much > >> work to do on my own, though, and it's going to be disruptive, so we > >> need to do it real soon. I think it'll be worth it, though. > >> > >> -- > >> Soren Hansen | http://linux2go.dk/ > >> Ubuntu Developer | http://www.ubuntu.com/ > >> OpenStack Developer | http://www.openstack.org/ > >> > >> _______________________________________________ > >> Mailing list: https://launchpad.net/~openstack > >> Post to : openstack@xxxxxxxxxxxxxxxxxxx > >> Unsubscribe : https://launchpad.net/~openstack > >> More help : https://help.launchpad.net/ListHelp > > > > > > _______________________________________________ > > Mailing list: https://launchpad.net/~openstack > > Post to : openstack@xxxxxxxxxxxxxxxxxxx > > Unsubscribe : https://launchpad.net/~openstack > > More help : https://help.launchpad.net/ListHelp > > > _______________________________________________ > Mailing list: https://launchpad.net/~openstack > Post to : openstack@xxxxxxxxxxxxxxxxxxx > Unsubscribe : https://launchpad.net/~openstack > More help : https://help.launchpad.net/ListHelp
Thread Previous • Date Previous • Date Next • Thread Next |