
acmeattic-devel team mailing list archive

Re: [AcmeAttic] Design doc: Plan of action

 

On Sun, Jul 11, 2010 at 9:58 PM, krishnan parthasarathi <
krishnan.parthasarathi@xxxxxxxxx> wrote:

>
>
>>>>
>>>>    - Programming tools: Eg. I vote for tools like Protocol buffers,
>>>>    Python Twisted framework, pycrypto library (More on the Wiki [1])
>>>>
>>>>  I think we should try to reduce the dependencies on the s/w if we are
>>>> to make it cross-platform. So, I propose that we should integrate pycrypto
>>>> code inside our software. Twisted should be available on most platforms (I
>>>> think), so we don't have to worry
>>>>
>>> Twisted is available for all platforms (Linux, Windows, OS X). pyCrypto
>>> is also available for all platforms.
>>> However, I am not completely convinced of the integration. I would really
>>> love to hear from the corporate guys on this. (Let's blow the Vuvuzela to
>>> call them!)
>>> We could definitely build binaries with the dependencies, but I would
>>> like to keep our codebase small. Integration is a release plan, right? Does
>>> it affect our development?
>>>
>>
>> Integrating pycrypto would probably not increase the codebase much. We
>> will be stripping it down to bare necessities, but Suren or at least
>> someone else should probably comment on this. We seem to be the only ones
>> actively discussing things.
>>
> The main disadvantage of 'integrating' pycrypto into our codebase is that
> we must maintain that piece of code as well. We must keep track of
> pycrypto's development cycle to do this. Our dependency on pycrypto is like
> any other s/w dependency. This needs to be resolved in the packaging of the
> software, e.g. tarball, .deb, .rpm, .egg, etc.
>
>
>>
>>>
>>>
>>>> about that. It would also be quite a challenge to integrate that into
>>>> our s/w. I am still a bit ambivalent about Protocol buffers. I don't think
>>>> we really need it. It is meant to be usable across programming languages,
>>>> etc. We don't really need it. I think that a simpler way to exchange
>>>> messages between server/client is to use pickling [1]. We can define a
>>>> message structure that contains all the fields we require, and pickle it. We
>>>> can wrap the pickled structure in TCP and send it.
>>>>
>>> I know how Pickle and cPickle work. One of the most important
>>> disadvantages of Pickle and similar serialization techniques is backward
>>> compatibility. Suppose we update the object contents to add another field:
>>> all of the servers and clients have to be updated with the new version,
>>> otherwise Pickle crashes [1]. Avoiding this is an important advantage of
>>> Protocol buffers.
>>>
>>
>> The pickle documentation page does not seem to say when it will crash. Can
>> you be more specific? Unpickling an older version object will simply give an
>> object having a different set of attributes than the version that the newer
>> software is expecting - this need not crash anything right? This can be
>> handled in code itself keeping backwards compatibility in mind.
>>
> The messages are internal to our software and we can ensure we don't have
> our server and client software advance at different speeds in their message
> format compatibility.
>
Is that easily possible? We are surely going to have clients who update
their software later than the servers are updated and our product releases
come out. I think it's a safe choice to have compatibility now.
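
To make the compatibility point concrete, here is a rough sketch of what
"keeping compatibility in mind" could look like if messages are plain
dictionaries. This is not AcmeAttic code: the field names, the DEFAULTS
table and the encode/decode helpers are all made up for illustration. The
receiver fills in defaults for fields an older sender did not know about,
and ignores fields a newer sender added.

```python
import pickle

# Hypothetical message fields known to this side, with defaults.
# A message with no "version" field is treated as version 1.
DEFAULTS = {"version": 1, "op": None, "path": None, "checksum": None}


def encode(fields):
    """Serialize a plain dict of message fields."""
    return pickle.dumps(dict(fields), protocol=2)


def decode(data):
    """Deserialize a message, tolerating missing and unknown fields."""
    received = pickle.loads(data)
    msg = dict(DEFAULTS)
    # Copy only the fields this side understands; unknown ones are dropped.
    msg.update((k, v) for k, v in received.items() if k in DEFAULTS)
    return msg


# An older client that does not send "checksum".
old_client_msg = encode({"version": 1, "op": "put", "path": "/a"})

# A newer server that sends an extra "ttl" field this side has never seen.
new_server_msg = encode(
    {"version": 2, "op": "put", "path": "/a", "checksum": "abc", "ttl": 30}
)
```

With this shape, decode(old_client_msg) comes back with checksum set to
None, and decode(new_server_msg) comes back without the unknown ttl field,
so neither side has to be updated in lockstep.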

>
>>
>>> However, the protocol buffers implementation in Python is pretty slow. I
>>> would advocate a serialization format such as ProtoBuf, XML, JSON, Thrift,
>>> etc.
>>>
>>>    - Protobuf - Really good C++ version. Python version is really slow.
>>>    We can hope for Google to improve this. Maybe Satya can tell us more about
>>>    the status of this.
>>>    - XML - too bulky and slow.
>>>    - JSON - Less bulky than XML, but still uses ASCII encoding
>>>    - Thrift - uses binary encoding (similar to protobufs). Released out
>>>    of Facebook. New and under development.
>>>
>> Assuming pickling is really a bad choice (which I am not sure of), I
>> think we should concentrate on performance in alternative ways. XML and JSON
>> we should definitely avoid. Can someone run a speed comparison between
>> thrift and protobuf?
>>
> I did the following test. I pickled a class with two members. I tried
> unpickling it onto a different class and received an AttributeError
> exception (not a crash). These exceptions can be handled when we have
> backward compatibility issues. Now that this is cleared, I think we can go
> ahead with pickling as our message serialization format.
>
Yes, Pickle throws an exception. Pickle is really meant to be a simple
Python library for serialization, and is not meant to handle such
compatibility issues. Damn, I even had to search for a while before I landed
on proper examples for classes - it looks like it's meant for simple objects
such as dictionaries.
Supposing there is a client running old software, the server will not be
able to push out messages before the client class is updated. Backward
compatibility will allow us to quickly push out new features without having
to worry about the whole world updating.
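
For reference, the failure mode can be reproduced with something like the
sketch below. It is only an illustration: the module name
"acme_messages_demo", the Hello class and the try_load helper are invented
here, and removing the class from the module is just a way to simulate a
peer whose software does not have that class yet.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a "messages" module shipped with the software.
mod = types.ModuleType("acme_messages_demo")
sys.modules[mod.__name__] = mod


class Hello:
    def __init__(self, user, path):
        self.user = user
        self.path = path


# Make pickle record the class as acme_messages_demo.Hello.
Hello.__module__ = mod.__name__
Hello.__qualname__ = "Hello"
mod.Hello = Hello

# The sender pickles an instance; only the class *name* goes on the wire,
# not the class definition.
wire = pickle.dumps(Hello("karthik", "/attic"))

# Simulate a receiver whose software does not define the class.
del mod.Hello


def try_load(data):
    """Unpickle, reporting the lookup failure instead of propagating it."""
    try:
        return pickle.loads(data)
    except AttributeError as exc:
        return "AttributeError: %s" % exc


result = try_load(wire)
```

The exception can be caught, as above, but the receiver still has no usable
message at that point, which is the part that worries me.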

Btw, Thrift has very bad documentation up right now. From what I read
online, it looks like Thrift and Protobuf are almost neck and neck. But some
claim that Protobufs generate smaller dumps.
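
On dump size in general, here is a small sketch of why a binary layout
tends to be smaller than an ASCII one. It does not use Thrift or Protobuf
at all, only the Python standard library, and the record with its three
fields is a made-up example rather than an AcmeAttic format. The struct
layout stands in for "some binary encoding" and says nothing about how
Thrift or Protobuf actually compare with each other.

```python
import json
import pickle
import struct

# Hypothetical record: file id, byte offset, chunk size.
record = {"id": 123456789, "offset": 987654321, "size": 4096}

# ASCII encoding: field names and digits are sent as text.
as_json = json.dumps(record).encode("ascii")

# Pickle of the same dict, for comparison.
as_pickle = pickle.dumps(record, protocol=2)

# Fixed binary layout in network byte order: two unsigned 64-bit integers
# followed by one unsigned 32-bit integer. Field names are not sent.
LAYOUT = "!QQI"
as_binary = struct.pack(LAYOUT, record["id"], record["offset"], record["size"])

sizes = {
    "json": len(as_json),
    "pickle": len(as_pickle),
    "binary": len(as_binary),
}

# The receiver needs to know the layout to get the values back.
decoded = struct.unpack(LAYOUT, as_binary)
```

The binary form is 20 bytes here because the layout is fixed, but that also
means both ends have to agree on it, which is the same compatibility
question again.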

>
>
> cheers,
> krishnan
>
> _______________________________________________
> Mailing list: https://launchpad.net/~acmeattic-devel
> Post to     : acmeattic-devel@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~acmeattic-devel
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 
Karthik
