
acmeattic-devel team mailing list archive

Re: Modifications to versioning system proposal

 

Let me summarise the versioning system proposal so far:

  1. Forward diffs are kept. So if the various versions of a file F
     are F1, F2, F3, etc, then the server stores: F1, d(F1,F2), d(F2,
     F3), ...
  2. Because only forward diffs are stored, the first checkout by a
     client requires a large download to reach the latest copy of a file.
     To decrease the size of the download, we could decide to store the
     whole file at a point, instead of a diff. This would look like (on
     the server), F1, d(F1,F2), d(F2,F3), F4, d(F4, F5), etc. A simple
     metric to decide when to store the whole file would be to see if
     the size of F1 + d(F1,F2) + d(F2, F3) is greater than F4.
  3. All diffs and files are sent encrypted to the server.
  4. Text files can use normal line based diff algorithms. Binary files
     won't generate good diffs with such algorithms. Instead we can use
     an rsync-like algorithm [1] for binary files. It will generate
     better diffs. For early releases, we can compromise to just store
     the full files of successive revisions for binary files.
  5. All this suggests that we need a custom version management
     module, instead of reusing one like hg, git or bzr. But I think
     that, in terms of performance and flexibility, we are better off
     writing our own. We can always borrow code from these other
     projects.
  6. The number of versions to store, and related details, are still
     undecided. We could discuss this further. Related issues:
        1. The server sees only encrypted data. So it will not be able
           to process diff files to compact, etc, at least with the
           current encryption scheme of using AES.
        2. Clients could do compaction, before sending versions to the
           server.
        3. Clients can also tell the server to discard certain older
           versions when they decide to store a full file.
        4. The frequency of storing revisions should be flexible later
           on, but for quick early release we can compromise.
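
To make points 1 and 2 concrete, here is a rough Python sketch of a
forward-diff chain with the "store the whole file" check. It reads the
metric in point 2 literally: a new version is stored whole once the chain
a client would otherwise download (last full copy plus the diffs after it)
is larger than the new file. The names VersionStore, make_diff and
apply_diff are made up for illustration, sizes ignore the overhead of copy
ranges, and encryption (point 3) is left out entirely.

```python
import difflib


def make_diff(old, new):
    """Forward diff d(old, new): copy ranges from old plus literal new lines."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("add", new[j1:j2]))
        # "delete" needs no entry: the old lines are simply not copied.
    return ops


def apply_diff(old, ops):
    """Rebuild the newer version from the older one and a forward diff."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def full_size(lines):
    return sum(len(line) for line in lines)


def diff_size(ops):
    # Assumption: only literal lines are counted, copy ranges cost nothing.
    return sum(full_size(op[1]) for op in ops if op[0] == "add")


class VersionStore:
    """Hypothetical server-side layout: F1, d(F1,F2), ..., F4, d(F4,F5), ..."""

    def __init__(self):
        self.entries = []  # ("full", lines) or ("diff", ops), oldest first

    def chain_size(self):
        """Size of the last full copy plus every diff stored after it."""
        total = 0
        for kind, payload in reversed(self.entries):
            if kind == "full":
                return total + full_size(payload)
            total += diff_size(payload)
        return total

    def checkout(self):
        """Latest version. Replays everything; a real client would start
        at the last full copy, which gives the same result."""
        lines = []
        for kind, payload in self.entries:
            lines = list(payload) if kind == "full" else apply_diff(lines, payload)
        return lines

    def commit(self, lines):
        """Store a new version and report whether it went in whole."""
        lines = list(lines)
        if not self.entries or self.chain_size() > full_size(lines):
            self.entries.append(("full", lines))
            return "full"
        self.entries.append(("diff", make_diff(self.checkout(), lines)))
        return "diff"
```

With this reading, a file that only grows by appending never triggers a
full copy, while repeated edits in place do fairly quickly. That may be a
sign the threshold wants tuning, e.g. comparing only the accumulated diffs
against the full file.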

Hope we can have a better discussion now. Please reply inline if you are going to respond to each point individually.

--
Aditya.

[1]: http://samba.anu.edu.au/rsync/tech_report/

