
launchpad-dev team mailing list archive

Re: headsup: upcoming changes to oops-*

 

On Wed, Nov 16, 2011 at 11:59 PM, John Arbash Meinel
<john@xxxxxxxxxxxxxxxxx> wrote:
> I'm curious if you benchmarked bson vs json vs whatever the current
> serializer is. I know I've heard a lot of statements that "bson is
> faster", but it came up on the U1 mailing list and so I went ahead and
> benchmarked it.
>
> json was encoded/decoded using simplejson.dumps/loads. It also supports
> having a list as the top level object, so I made a string of 50k objects
> wrapped as "[" + ",\n".join(lines) + "]".
> I also compared that to encoding/decoding each entry one-by-one.
>
> My summary was:
>
>         encode_all   encode_by_1   decode_all   decode_by_1
> json    0.766        1.130         0.511        0.795
> bson    -            0.873         -            0.515
>
> Basically, at best bson is about 1.5x faster than simplejson, but doing
> the operation in bulk with simplejson was even faster than bson, for
> both encoding and decoding.
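
(A minimal sketch of the bulk vs per-record comparison described above,
assuming simplejson and synthetic records standing in for the original
50k music metadata entries; the record contents here are placeholders:)

    import time
    import simplejson

    # Synthetic stand-in data; the original benchmark used ~50k music
    # metadata records totalling ~33MB.
    records = [{'title': 'track %d' % i, 'length': i % 600}
               for i in range(50000)]

    def bench(label, fn):
        start = time.time()
        fn()
        print('%-12s %.3fs' % (label, time.time() - start))

    # Encode the whole list at once vs one record at a time.
    bench('encode_all', lambda: simplejson.dumps(records))
    bench('encode_by_1', lambda: [simplejson.dumps(r) for r in records])

    # Decode the whole list at once vs one record at a time.
    blob = simplejson.dumps(records)
    lines = [simplejson.dumps(r) for r in records]
    bench('decode_all', lambda: simplejson.loads(blob))
    bench('decode_by_1', lambda: [simplejson.loads(l) for l in lines])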

Which bson encoder / decoder did you use? The mongodb one or the one
on pypi? [idly curious]

> This was 50k music metadata records, about 33MB of data in total.
>
> Also, I tried just writing the bson data out to a file, though I'm not
> sure how you are supposed to write a list of bson records (maybe wrap
> the list in an object and store it as a value?)

bson requires a 'document' top level: {'mylist': [....]}.
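
For example (a rough sketch assuming the bson module that ships with
pymongo; the standalone pypi bson package uses bson.dumps/loads instead):

    from bson import BSON

    records = [{'a': 1}, {'b': 2}]            # whatever you want to store
    data = BSON.encode({'mylist': records})   # top level must be a document
    assert BSON(data).decode()['mylist'] == records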

> I happen to really like that BSON is type-length-prefixed, but I
> wonder if JSON wouldn't have had better interoperability.

json doesn't handle non-unicode data, which is an issue as we get some
-real- crap in over HTTP, and it's also very nice that bson supports
datetime directly - we record a lot of datetimes ;)
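
(To illustrate the datetime point - a sketch assuming the pymongo bson
module; simplejson behaves the same way as the stdlib json here:)

    import datetime
    import json
    from bson import BSON

    doc = {'when': datetime.datetime.utcnow()}

    BSON.encode(doc)   # fine: BSON has a native UTC datetime type
    json.dumps(doc)    # raises TypeError: datetime is not JSON serializable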

-Rob

