syncany-team team mailing list archive

Re: Long ids

To: Gregor Trefs <gregor.trefs@xxxxxxxxx>, Syncany Mailing List <syncany-team@xxxxxxxxxxxxxxxxxxx>
From: Philipp Heckel <philipp.heckel@xxxxxxxxx>
Date: Thu, 5 Dec 2013 20:37:47 +0100
In-reply-to: <CAEyuBZuws7wED3Au4UvTNzsZbHK4Dpa9FOYs1HhK0a7PK-Yq+w@mail.gmail.com>

Hi Gregor,

you forgot to hit "reply all" again :-)
I'll leave everything in there, so that everyone can read what you wrote.

On Thu, Dec 5, 2013 at 10:17 AM, Gregor Trefs <gregor.trefs@xxxxxxxxx> wrote:

> Hi Philipp, hi Fabrice,
>
> I read your mails yesterday, and what I in particular did not understand
> was the imposing of a type hierarchy over an ID and the corresponding use
> case:
>
> >> (then you cannot mistakenly search for a file using a chunk id, for
> instance)
>
> I can't get used to the idea of using the ID to distinguish different
> class types (e.g. Multichunk, FileContent). If an instance of a class has
> an ID, then the type information is already encoded in the instance. Under
> the assumption (and I have the impression) that you want to use the ID as a
> standalone concept (e.g. storing it in an index to avoid fetching all
> multichunk information), you may use a weak reference from the ID to
> your instance.
>

Although I did not fully understand what you wrote here, I hope this
captures what you meant: You're wondering why we need an extra ObjectId
class (and FileId, MultiChunkId, ...), because you *assume* that the ID
cannot appear anywhere in the code without a corresponding instance (e.g. a
FileHistory carries its own ID, so there is no need for a special class
FileId).

However, this assumption is wrong. IDs can appear in database files and
code as references to the actual object.

Example:
The <multichunk> tag contains <chunkRef> tags, thereby referencing a Chunk
--> this translates to ChunkId, not Chunk

The reason for this indirect referencing in the XML is clear: A chunk can
be referenced in a certain database file, but it does not need to appear in
this particular database version. Instead, it could have first appeared in
a previous database version:

<databaseVersion>
  <chunks>
    <chunk id="ab1fdf" size="512" />
  </chunks>
</databaseVersion>

<databaseVersion>
  <multiChunks>
    <multiChunk>
      <chunkRef ref="ab1fdf" />
    </multiChunk>
  </multiChunks>
</databaseVersion>
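
As a rough sketch of what this looks like once the two versions are loaded (all class, field and method names here are made up for illustration and are not the actual Syncany code, and plain strings stand in for the IDs to keep it short): the second version only stores the reference, and the chunk itself is found by looking through the versions in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChunkRefSketch {
    /** Hypothetical in-memory form of one <databaseVersion> element. */
    static final class DatabaseVersion {
        // <chunk id="..." size="..."/> entries declared in this version
        final Map<String, Integer> chunkSizesById = new LinkedHashMap<>();
        // <chunkRef ref="..."/> entries: references only, no chunk data
        final List<String> multiChunkRefs = new ArrayList<>();
    }

    /** Searches the given versions in order; returns null if none declares the chunk. */
    static Integer findChunkSize(List<DatabaseVersion> versions, String chunkId) {
        for (DatabaseVersion version : versions) {
            Integer size = version.chunkSizesById.get(chunkId);
            if (size != null) {
                return size;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        DatabaseVersion first = new DatabaseVersion();
        first.chunkSizesById.put("ab1fdf", 512);

        DatabaseVersion second = new DatabaseVersion();
        second.multiChunkRefs.add("ab1fdf"); // referenced here, declared earlier

        String ref = second.multiChunkRefs.get(0);
        System.out.println(findChunkSize(Arrays.asList(first, second), ref));
    }
}
```

If only the second version were in memory, the same lookup would come back empty, even though the reference itself is perfectly valid.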

Why this indirection should be carried over to the code is less obvious,
because in theory, these references could be resolved: if you encounter a
<chunkRef> in the database, query it and get the corresponding Chunk object:
database.getChunk(new ChunkId("ab1fdf"))

However, this cannot always be done, because sometimes there is only a
partial database in memory, so holding just a reference to the object (its
ID) is a better (and more memory-conserving) way. Using a ChunkId, FileId,
etc. is just a way to make sure that you cannot select a Chunk using a
FileId, so database.getChunk(new FileId("ab1fdf")) is not possible.
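
To make the type-safety point concrete, here is a minimal sketch. ObjectId, ChunkId, FileId and getChunk are the names used above; everything else (the fields, constructors, the Chunk and Database classes, addChunk) is assumed for illustration and is not the actual Syncany implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class TypedIdSketch {
    /** Common base class: wraps the raw identifier string. */
    static abstract class ObjectId {
        private final String value;

        ObjectId(String value) {
            this.value = Objects.requireNonNull(value);
        }

        // Two IDs are equal only if they have the same concrete type AND the same value.
        @Override
        public boolean equals(Object other) {
            return other != null
                && getClass() == other.getClass()
                && value.equals(((ObjectId) other).value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(getClass(), value);
        }

        @Override
        public String toString() {
            return value;
        }
    }

    static final class ChunkId extends ObjectId {
        ChunkId(String value) { super(value); }
    }

    static final class FileId extends ObjectId {
        FileId(String value) { super(value); }
    }

    /** Hypothetical chunk entry: just an ID and a size. */
    static final class Chunk {
        final ChunkId id;
        final int size;

        Chunk(ChunkId id, int size) {
            this.id = id;
            this.size = size;
        }
    }

    /** Hypothetical (possibly partial) in-memory database. */
    static final class Database {
        private final Map<ChunkId, Chunk> chunks = new HashMap<>();

        void addChunk(Chunk chunk) {
            chunks.put(chunk.id, chunk);
        }

        /** Returns the chunk, or null if it is not in the loaded part of the database. */
        Chunk getChunk(ChunkId id) {
            return chunks.get(id);
        }
    }

    public static void main(String[] args) {
        Database database = new Database();
        database.addChunk(new Chunk(new ChunkId("ab1fdf"), 512));

        Chunk chunk = database.getChunk(new ChunkId("ab1fdf"));
        System.out.println(chunk.size);

        // database.getChunk(new FileId("ab1fdf")); // rejected by the compiler
    }
}
```

The commented-out call fails at compile time because getChunk only accepts a ChunkId, which is the whole point: the mistake is caught before the code ever runs.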

> You would be able to stay typesafe without imposing your own inheritance
> structure upon ObjectID, which I assume to be rather flat. This satisfies
> point a. For points b and c I thought about UTF, which has this dynamic size
> concept (as far as I remember): only use the amount of memory that is
> really needed. This is a trade-off between a flexible size and a low memory
> footprint.
>

You totally lost me here! What's a, b, c?

>
> Regards
> Gregor
>

Best regards,
Philipp
