Jaap,

I am trying to figure out the best storage solution for small changes. So far there are two suggestions:

1. a single file with separated patches and integrated timestamps (in progress)
2. a zip archive containing patches / files (the name of each file represents its timestamp)

solution nr. 1:
- an index is not possible in this solution, only a chain of patches
- special characters followed by timestamps are used to separate the patches
- it is easier to append new patches to the end of the file
- the file could be tracked by the main VCS (since it is text)

solution nr. 2:
- possibility to keep an index file, which speeds up lookup of patches
- timestamps are stored as the names of the individual files in the zip archive
- the timeline is compressed
- it is easier to list all patches, even in a file browser

Probably the best approach is the first version.

Additionally, I have a question: is it fine to use other code that is not licensed under the GPL? In particular, this concerns a suitable Python module licensed under the Apache License, Version 2.0:
http://code.google.com/p/google-diff-match-patch/

The attachments contain:
1. diff_match_patch.py (Python code which generates patches and reconstructs text)
2. page.timeline (proposed storage of changes with timestamps)
3. testing of concept.py (code which utilizes diff_match_patch.py and serves as a proof of concept)

JK

On 6 June 2013 15:41, Jaap Karssenberg <jaap.karssenberg@xxxxxxxxx> wrote:
> On Tue, Jun 4, 2013 at 11:41 PM, NorfCran <norfcran@xxxxxxxxx> wrote:
>
>> Yes, I do not want to replace the text files with XML (the concept based
>> on TXT files is the most flexible in my opinion). You are right, attaching
>> additional meta information in a shadow file is the only way.
>>
>> Since you have asked for use cases, I am going to model some of them
>> separately for a notebook and for a single page, in order to illustrate
>> utilization of Time Stamped Text (TST):
>>
>> *notebook*
>>
>>    1. search the most recently modified pages by a time range
>>       - hierarchical structures like wiki are dynamic and often changes
>>       happen on many pages, so why not to preserve this flow determined by time
>>       in its natural form?
>>       - ordinary search with many matching pages may be filtered out by
>>       a time range, which is almost possible only very generally based on ctime
>>       and mtime in the zim's database
>>
>> *page (main utilization of TST)*
>>
>>    1. highlighting up to date changes by smooth versions stored in TST
>>    data structure
>>       - most up to date changes are highlighted for instance by a red
>>       color and it fades to black (so it is easier to see, in case of
>>       modifications and revisions)
>>    2. provides possibility to revert changes by performing undo and redo
>>    any time even though the text buffer is no longer available
>>    3. time is a natural binder for any other activity performed in
>>    parallel to the note taking process → for instance searching information on
>>    the web (traceable from browser's history)
>>
>> One of my projects attached to the email researches a graph
>> data-structure among other solutions capable of storing timestamps per word
>> chunks (as a lowest granularity) separated by spaces. The data-structure is
>> based on graphs and it has been implemented, but it is still not robust
>> enough for all cases. Maybe it can bring some additional understanding and
>> further direction of our discussion.
>>
>
> OK, I would plan something like that as follow:
>
> 1/ Come up with a compact patch / diff like format that we can use to
> store small changes in a "journal file" next to the source file)
> 2/ Hook up a plugin to write such a patch file and update on each
> auto-save action
> 3/ Add an API to the plugin such that we can
>    a/ query timestamp for a specific piece of text in a specific file
>    b/ can request timestamps for each part of the current version of the
> file
>    c/ request previous / next change for a given file
> 4/ Extend the search function to use API part "a" and add a column to the
> search dialog
> --> fulfills first use case
>
> 5/ Extend the page view to use API part "b" to highlight recent changes
> 6/ Hook up the undo/redo-manager
>    a/ to use API part "c" to extend undo in the past
>    b/ send data to the plugin per change as they happen, instead of
> waiting for auto-save
> --> fulfills 2nd use case
>
> Probably the quickest result for a first result would be if you look into
> step 1-3a and I hack in step 4. If that is working I'm willing to help with
> steps 5 & 6 to integrate into the editor widget.
>
> Regards,
>
> Jaap
>
>
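To make solution nr. 1 (and step 1/ of the plan quoted above) more concrete, here is a minimal sketch of an append-only timeline kept as one text file. It is only an illustration under assumptions: the separator character, the ISO-style timestamps, the JSON patch encoding and all function names (`make_patch`, `apply_patch`, `append_entry`, `replay`) are hypothetical and do not describe the attached page.timeline format. It also uses Python's standard `difflib` instead of the attached diff_match_patch.py, so that the example is self-contained.

```python
import difflib
import json

# Assumed separator: the ASCII "record separator" control character.
# json.dumps escapes it (and newlines), so it cannot occur inside a patch.
SEP = "\x1e"


def make_patch(old, new):
    """Return a list of [start, end, replacement] edits turning old into new."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Replace the slice old[i1:i2] with the text new[j1:j2].
            ops.append([i1, i2, new[j1:j2]])
    return ops


def apply_patch(text, ops):
    """Apply edits produced by make_patch to text."""
    # Work from the end of the text so earlier offsets stay valid.
    for i1, i2, replacement in reversed(ops):
        text = text[:i1] + replacement + text[i2:]
    return text


def append_entry(timeline, timestamp, old, new):
    """Append one timestamped patch to the end of the timeline string."""
    payload = json.dumps(make_patch(old, new))
    return timeline + SEP + timestamp + "\n" + payload + "\n"


def replay(timeline, until=None):
    """Rebuild the page from an empty text, optionally stopping at a timestamp."""
    text = ""
    for record in timeline.split(SEP)[1:]:
        timestamp, payload = record.split("\n", 1)
        # Timestamps in one fixed ISO format compare correctly as strings.
        if until is not None and timestamp > until:
            break
        text = apply_patch(text, json.loads(payload))
    return text


timeline = ""
timeline = append_entry(timeline, "2013-06-04T23:41:00", "", "Hello world")
timeline = append_entry(timeline, "2013-06-06T15:41:00",
                        "Hello world", "Hello zim world")
print(replay(timeline))
print(replay(timeline, until="2013-06-04T23:41:00"))
```

The sketch shows the trade-off listed for solution nr. 1: appending is a plain string (or file) append and the result stays text that a VCS can track, but finding an older version means walking the chain of patches from the start, since there is no index.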
Attachment: diff_match_patch.py (Description: Binary data)
Attachment: page.timeline (Description: Binary data)
Attachment: testing of concept.py (Description: Binary data)