acmeattic-devel team mailing list archive

Re: [AcmeAttic] Design doc: Plan of action

 

On 11 July 2010 21:35, Karthik Swaminathan Nagaraj <nkarthiks@xxxxxxxxx> wrote:

>
>>    - Solidify revision, encryption schemes for the first alpha release.
>>    Eg: I vote for skipping Snapshotting, provisions for revision merging and
>>    similar performance features. We would already need a lot of infrastructure
>>    set up to get to Release 1.
>>
>>  I am not yet sure what you mean by snapshotting. Do you mean that for the
>> first release, we can simply keep daily revisions? That is, keep full copies
>> of the file for each day? (Of course, no copy on day 2 if there is no change
>> from day 1's copy).
>>
> Yes. Just keep adding revisions without cleaning them up (and, of course,
> just the diffs).
>

So you actually meant keeping diffs instead of copies. I thought
snapshotting meant keeping full copies. I suppose we could use either
initially, but I think we will go with keeping diffs.
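
As a rough sketch of what "just keep adding diffs" could look like, here is a
minimal in-memory version built on Python's difflib. The names RevisionStore,
make_delta and apply_delta are made up for illustration, and it assumes files
are handled as lists of text lines. It is not a proposal for the actual storage
format.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Build a compact forward delta that turns old_lines into new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: only remember which old lines to reuse.
            delta.append(("copy", i1, i2))
        elif j2 > j1:
            # Inserted or replaced run: store the new lines themselves.
            delta.append(("insert", new_lines[j1:j2]))
        # Pure deletions need no entry: the old lines are simply not copied.
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer revision from the older one plus its delta."""
    new_lines = []
    for op in delta:
        if op[0] == "copy":
            new_lines.extend(old_lines[op[1]:op[2]])
        else:
            new_lines.extend(op[1])
    return new_lines


class RevisionStore:
    """Hypothetical store: one full base copy, then only deltas, never cleaned up."""

    def __init__(self, first_revision):
        self.base = list(first_revision)
        self.deltas = []
        self._latest = list(first_revision)

    def add_revision(self, lines):
        lines = list(lines)
        if lines == self._latest:
            return False  # no change since the last revision, so store nothing
        self.deltas.append(make_delta(self._latest, lines))
        self._latest = lines
        return True

    def get_revision(self, n):
        """Revision 0 is the base; revision n replays the first n deltas."""
        lines = self.base
        for delta in self.deltas[:n]:
            lines = apply_delta(lines, delta)
        return list(lines)
```

Reading an old revision replays every delta from the base, which is the cost
we would accept by skipping snapshotting for the first release.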


>
>>
>>    - Programming tools: Eg. I vote for tools like Protocol buffers,
>>    Python Twisted framework, pycrypto library (More on the Wiki [1])
>>
>>  I think we should try to reduce the dependencies on the s/w if we are to
>> make it cross-platform. So, I propose that we should integrate the pycrypto code
>> inside our software. Twisted should be available on most platforms (I
>> think), so we don't have to worry
>>
> Twisted is available for all platforms (Linux, Windows, OS X). pyCrypto is
> also available for all platforms.
> However, I am not completely convinced about the integration. I would really
> love to hear from the corporate guys on this. (Let's blow the Vuvuzela to
> call them!)
> We could definitely build binaries with the dependencies, but I would like
> to keep our codebase small. Integration is a release plan, right? Does it
> affect our development?
>

Integrating pycrypto would probably not increase the codebase much. We will
be stripping it down to the bare necessities, but Suren or at least someone
else should probably comment on this. We seem to be the only ones actively
discussing things.


>
>
>> about that. It would also be quite a challenge to integrate that into our
>> s/w. I am still a bit ambivalent about Protocol buffers. I don't think we
>> really need it. It is meant to be usable across programming languages, etc.
>> We don't really need it. I think that a simpler way to exchange messages
>> between server/client is to use pickling [1]. We can define a message
>> structure that contains all the fields we require, and pickle it. We can
>> wrap the pickled structure in TCP and send it.
>>
> I know how Pickle and cPickle work. One of the most important disadvantages
> of Pickle and similar serialization techniques is backward compatibility.
> Suppose we update the object contents to add another field: all of the
> servers and clients have to be updated with the new version, otherwise
> Pickle crashes [1]. This is an important advantage of Protocol buffers.
>

The pickle documentation page does not seem to say when it will crash. Can
you be more specific? Unpickling an object from an older version will simply
give an object with a different set of attributes than the newer software
expects. This need not crash anything, right? It can be handled in the code
itself, keeping backward compatibility in mind.
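
To make that concrete, here is a small sketch of handling a missing or extra
field in code. It assumes the message is a plain dict rather than a class
instance, and the field names (sender, payload, priority) and the helpers
encode/decode are made up for illustration. The receiver fills in defaults for
fields an older sender did not know about and ignores fields it does not know
itself.

```python
import pickle

# Hypothetical message layout. Pretend 'priority' was added in a later
# version, so older clients never send it.
DEFAULTS = {"sender": None, "payload": b"", "priority": 0}


def encode(message):
    """Pickle a message dict into bytes for sending."""
    return pickle.dumps(message, protocol=2)


def decode(data):
    """Unpickle a message and normalise it to the fields this version knows."""
    raw = pickle.loads(data)
    message = dict(DEFAULTS)
    # Known fields override the defaults; unknown fields from newer senders
    # are dropped instead of causing an error later on.
    message.update((key, value) for key, value in raw.items() if key in DEFAULTS)
    return message
```

With this, a message pickled by an older client (no 'priority') decodes with
priority 0, and a message from a newer client with an extra field still
decodes on an older server.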


> However, the protocol buffers implementation in Python is pretty slow. I
> would advocate a serialization format such as ProtoBuf, XML, JSON, Thrift,
> etc.
>
>    - Protobuf - Really good C++ version. Python version is really slow. We
>    can hope for Google to improve this. Maybe Satya can tell us more about the
>    status of this.
>    - XML - too bulky and slow.
>    - JSON - Less bulky than XML, but still uses ASCII encoding
>    - Thrift - uses binary encoding (similar to protobufs). Released out of
>    Facebook. New and under development.
>

Assuming pickling is really a bad choice (which I am not sure of), I think
we should focus on performance when evaluating the alternatives. We should
definitely avoid XML and JSON. Can someone run a speed comparison between
Thrift and protobuf?
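
For whoever picks this up, a comparison could be structured roughly like the
sketch below. It only uses the standard library, so pickle and json stand in
as example codecs. A Thrift or protobuf entry would have to be added to CODECS
as another (encode, decode) pair, which is not shown here. The sample message
and the names CODECS, SAMPLE and benchmark are made up for illustration.

```python
import json
import pickle
import timeit

# Hypothetical sample message; a real test should use realistic payloads.
SAMPLE = {"path": "docs/report.txt", "revision": 42, "chunks": list(range(200))}

# Each codec is an (encode, decode) pair working on bytes.
CODECS = {
    "pickle": (lambda m: pickle.dumps(m, protocol=2), pickle.loads),
    "json": (
        lambda m: json.dumps(m).encode("utf-8"),
        lambda b: json.loads(b.decode("utf-8")),
    ),
}


def benchmark(codecs, message, rounds=1000):
    """Time encoding and decoding of one message for every codec."""
    results = {}
    for name, (encode, decode) in codecs.items():
        blob = encode(message)
        # Sanity check: the codec must round-trip the message unchanged.
        assert decode(blob) == message
        results[name] = {
            "bytes": len(blob),
            "encode_s": timeit.timeit(lambda: encode(message), number=rounds),
            "decode_s": timeit.timeit(lambda: decode(blob), number=rounds),
        }
    return results


if __name__ == "__main__":
    for name, stats in benchmark(CODECS, SAMPLE).items():
        print(name, stats)
```

The numbers will depend heavily on the message shape and the machine, so they
should be read as relative comparisons only.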
