← Back to team overview

drizzle-discuss team mailing list archive

Re: Stewart Smith: AlsoSQL

 

Sorry - I made a few mistakes my mail. Most notably, the first query
does not retrieve the genre - the second query does that. Also, the
last query asks for all directors that directed a film starring Al
Pacino and Robert de Niro (not just films starring these actors)

On Thu, Apr 8, 2010 at 1:33 AM, Roland Bouman <roland.bouman@xxxxxxxxx> wrote:
> Hi Stewart!
>
> On Wed, Apr 7, 2010 at 11:00 PM, Planet Drizzle
> <emailer@xxxxxxxxxxxxxxxxx> wrote:
>> A flexible query language is rather useful though. I think we’ll see the
>> rise of AlsoSQL. That is systems that present a fast and simple protocol
>> along with a SQL interface.
>
> Aye to query languages :)
>
>> This hybrid system has seen use for many years. MySQL Cluster is one such
>> example. SQL through MySQL Server, NoSQL through NDB API.
>
> Does NDB support a query language? Perhaps I am mistaken but I though
> it was a kind of API, where you have to pretty much tell the machine
> exactly what it should do.
>
>> With Drizzle, I feel we’ll be in a pretty good position to offer non-sql
>> based protocols and access methods to existing storage engines.
>
> I have been looking at Freebase lately (http://www.freebase.com/docs/data)
> I think it is a pretty interesting project for a number of reasons.
> One of them is that it has what seems to me a pretty fully featured
> query language.
> It's called MQL, and it allows query by example, subqueries (but the
> feature could also be seen as a join, depending on how you look at
> it), aggregation (although limited), sorting, pattern matching and
> more.
>
> The query language is available through a RESTful webservice (not sure
> if it's really REST, they claim so) and is a JSON vocabulary.
> (see: http://www.freebase.com/docs/web_services/mqlread)
>
> It is a pretty natural fit for (Ajax) web applications. Here's a
> simple query to retreive all films directed by Francis Ford Coppola,
> along with their genre:
>
> [{
>  "type": "/film/film",
>  "directed_by": "Francis Ford Coppola",
>  "name": null,
> }]
>
> Basically, this query asks for all films directed by Francis Ford Coppola
>
> This is the result:
>
> [
>  {
>    "directed_by": "Francis Ford Coppola",
>    "name": "Apocalypse Now",
>    "type": "/film/film"
>  },
>  {
>    "directed_by": "Francis Ford Coppola"
>    "name": "The Conversation"
>    "type": "/film/film"
>  },
>  ....many more objects...
> ]
>
> (If you want to try it out for yourself, use the query editor:
> http://www.freebase.com/app/queryeditor)
>
>
> As you can see, whatever we specify (type, directed_by) is matched and
> returned, whatever we leave null is filled in.
>
> Filling in the blanks also works for collections (genre in the
> following snippet):
>
> [{
>  "type": "/film/film",
>  "directed_by": "Francis Ford Coppola",
>  "name": null,
>  "genre": []
> }]
>
> Result:
>
> [
>  {
>    "directed_by": "Francis Ford Coppola",
>    "name": "Apocalypse Now",
>    "type": "/film/film",
>    "genre": ["Cult","War","Existentialism","New Hollywood"]
>  },
>  ....many more objects...
> ]
>
> This is also how you do joins. Suppose we want to get the names and
> characters of the actors:
>
> [{
>  "type": "/film/film",
>  "directed_by": "Francis Ford Coppola",
>  "name": null,
>  "genre": [],
>  "starring": [{"actor": null, "character": null}],
> }]
>
> Result is:
>
> [
>  {
>    "directed_by": "Francis Ford Coppola",
>    "name": "Apocalypse Now",
>    "type": "/film/film",
>    "genre": ["Cult","War","Existentialism","New Hollywood"],
>    "starring": [
>        {
>            "actor": "Marlon Brando",
>            "character": "Colonel Walter E. Kurtz"
>        },
>        {
>            "actor": "Robert Duvall",
>            "character": "Lt. Colonel William Kilgore"
>       },
>       {
>           "actor": "Martin Sheen",
>           "character": "Captain Willard"
>       }
>        ....many more actors...
>     ]
>   },
>   ...many more films...
> ]
>
> Actually, the difference with the earlier "genre" property is
> superficial: "genre" is also a type in its own right, and we could
> have gotten objects back for that too. But because we didn't ask for
> the genre to be returned as objects (we didn't use the curly braces)
> MQL automagically does the right thing and returns it as a list of
> scalars.
>
> Another, really, really nice thing IMO is that MQL can do something
> SQL can't, or only with difficulty. Genre and Actor are both details
> of film, and MQL has no difficulty returning them in the result with a
> single query. Try that in SQL - you can't: in for example the sakila
> database, a join like:
>
> FROM film
> INNER JOIN film_actor USING (film_id)
> INNER JOIN film_category USING (category_id)
>
> would explode into a semi-cartesian product in case you have multiple
> actors and multiple categories for one film. You can only work around
> it by doing two queries, or by doing some nasty multi-level grouping.
> (first join film to actors, group by film and serialize actors with
> group_concat - then join to categories, group by and group concat to
> serialize the categories.)
>
> As you can see, this query language totally obliterates the need for a
> object relational mapper - the applications can simply decide it
> themselves by writing the query according to the structure they want
> at the app level. In my book, that is a big win.
>
> There are a few tricks if the queries get more complex. For example,
> it seems natural to query all movies starring both Robert de Niro and
> Al Pacino with this query:
>
> [{
>  "type": "/film/director",
>  "name": null,
>  "film": [{
>    "name": null,
>    "starring": [{"actor": "Robert de Niro"}, {"name": "Al Pacino"}]
>  }]
> }]
>
> but this doesn't work. There is a reason for that, but this mail is
> already too long so you better look that up yourself. This is the
> right way to do it though:
>
> [{
>  "type": "/film/director",
>  "name": null,
>  "film": [{
>    "name": null,
>    "a:starring": {
>        "actor": "Robert de Niro",
>        "character": null
>    },
>    "b:starring": {
>        "actor": "Al Pacino",
>        "character": null
>    }
>  }]
> }]
>
> IMO, still a pretty reasonable syntax for a pretty complex query.
>
> Finally, I'd like to point out that although Freebase (or actually the
> underlying Metaweb database) is not relation, it is a graph database.
> But that hasn't prevented them from still maintaining pretty well
> defined schemas: the "type" attribute is basically a table as it
> defines a bag of properties (and relationships). But because one
> entitity can have many different types, and you can access properties
> of all associated types from within a single object notation, it still
> feels just like a so-called "schemaless" database in that respect.
>
> Anyway, I figured I just throw this in - I'd love to see this MQL
> language (and the webservice) slapped unto a relational database. I
> have some ideas for the mapping but I am sure other people see how
> this would work out.
> I am currently trying to build a quick and dirty PHP script to do this
> as a proof of concept.
> If anybody is interested in doing this natively in drizzle, I'd love
> to help out. (I fear I don't have the skills to jump in and do it on
> my own)
>
> Oh, final thing: there will be a freebase meetup in San Franscisco, April 14.
> http://www.meetup.com/sf-freebase/calendar/12845548/
> I am currently trying to figure out if it will be worth travelling
> from the MySQL uc to SF for that evening, but if other people are
> interested in going, perhaps we can join and go together?
>
> Thanks and kind regards,
>
> Roland.
>
>>
>> URL: http://www.flamingspork.com/blog/2010/04/08/alsosql/
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~drizzle-discuss
>> Post to     : drizzle-discuss@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~drizzle-discuss
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>
>
>
> --
> Roland Bouman
> http://rpbouman.blogspot.com/
>
> Author of "Pentaho Solutions: Business Intelligence and Data
> Warehousing with Pentaho and MySQL",
> http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470484322.html
>



-- 
Roland Bouman
http://rpbouman.blogspot.com/

Author of "Pentaho Solutions: Business Intelligence and Data
Warehousing with Pentaho and MySQL",
http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470484322.html



References