Hi Gregor,

On 06/12/2013 12:21, Gregor Trefs wrote:
>> 1) if we want ObjectID sizes to depend on the indexed type (this
>> is the case), we need something slightly more complicated. My take
>> would be to have all indexable classes implement an Indexable
>> interface with a preferredSize method which returns the number of
>> bytes needed. Then we will have ObjectId<T extends Indexable> or
>> something like this.
> No. Just let the Object ID be as simple from the outside as it is
> ;). If you want to make the ObjectID sizes dependent on the type
> information, then you might use the StrategyPattern.

I'm puzzled, isn't that a factory pattern? (I'm not being pedantic,
it's just that it does not strike me as different from a factory.)

> A bit more detail:
>
> 1. Class ObjectIdFactory This class has several creation methods for
> the different classes (e.g. createMultiChunckId(byte[] array)).
> Within such a method, you are aware for which type you create the
> ID. However, ObjectId should not be aware of this fact.

Well, that's more complicated in my opinion. It means that the logic
for creating an Id is neither in the ObjectId nor in the object to
identify. This seems too flexible and a maintenance burden (but see
below for more).

> Also, MultiChunckRef should know nothing more about its
> identifier than its pure existence.

That's a bit more complicated (see the discussion on the Chunks with
Philipp). A PartialFileHistory cannot contribute much in terms of
choosing its id, as there is no simple deterministic scheme that would
work in this context (file contents change, names too, etc.). So we
resort to a random id, which is a general facility that can be
provided by ObjectId (or a factory, see below). But in the case of a
Chunk, we have a natural candidate for the id, the chunk checksum, and
we have no reason to do something else. So Chunks know how to create
their id, PartialFileHistory objects don't.
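
As a rough illustration of these two cases, here is a minimal sketch.
The class name IdSources, both method names, and the choice of SHA-1 as
the chunk checksum are assumptions made for the example, not part of
the existing code.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical helper, for illustration only.
final class IdSources {
    private static final SecureRandom RANDOM = new SecureRandom();

    private IdSources() {}

    // Content-based id: the same chunk content always gives the same id.
    // SHA-1 is an assumed checksum; it yields 20 bytes.
    static byte[] contentBasedId(byte[] chunkContent) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(chunkContent);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Random id: for objects with no deterministic candidate,
    // such as a file history whose content and name both change.
    static byte[] randomId(int length) {
        byte[] bytes = new byte[length];
        RANDOM.nextBytes(bytes);
        return bytes;
    }
}
```

Only the first method needs anything from the object being identified,
which is why a chunk can build its own id while a file history has to
ask a general facility for one.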

> One solution is to have an abstract Strategy class whose instances
> know about the type (e.g. class of MultiChunkRef), id structure and
> how to best take care of ID related actions (e.g. compare with other
> ID).

Why move that to an abstract factory, if we can do it with a concrete
one?

> A little draft code:
>
> public class ObjectId<T> {
>     private final IdStrategy<T> strategy;
>
>     ...
>     public boolean equals(ObjectId other) {
>         ...
>         if (strategy.equals(other)) {
>             return true;
>         }
>     }
> }

Like I said, one of the design objectives is to be able to use
memory-tight ids. If one keeps a pointer to a strategy or to a class in
each instance of an Id, one wastes 4 or 8 bytes (depending on the
32/64-bit flavor of the JVM and whether the 64-bit reference can be
packed in this case). I think we can do better. I would do something
like this:

- an interface/abstract class Id<T>
- at least two concrete classes:
  - ShortId<T> for FileId and other objects for which a 16-byte random
    id is enough (based on the UUID experience, this is well suited to
    many cases)
  - ArrayId<T> for objects with longer ids (like content-based ids for
    Chunks)
- both concrete classes have proper hashCode, equals and toString
  methods, but none use T in those methods
- a factory IdFactory which provides:
  - convenience methods for creating random ids of a given length, for
    turning a String into an id and vice versa (the latter being used
    in the toString)
  - possibly, as you propose, factory methods for each indexable type

What is still not clear in my mind is whether the factory should be
generic and based on properties described in an Indexable interface
(then we will have Id<T extends Indexable>) or more pragmatic, with as
many methods as there are concrete indexable types. From a
philosophical point of view, I prefer the first solution, but it seems
a bit overengineered. Also, it needs an existing indexable object
before creating an id, which will not always be convenient or flexible.
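
The outline above can be sketched roughly as follows. This is a
minimal sketch, not a finished design: the class names come from the
list, but every method name, the two-long layout of ShortId, and the
hexadecimal String form are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.Arrays;

// T is a compile-time marker only; no instance stores a class or strategy.
abstract class Id<T> {
    abstract byte[] toBytes();

    @Override
    public String toString() {
        return IdFactory.toHex(toBytes());
    }
}

// 16-byte id held in two long fields (assumed layout), so no array object is needed.
final class ShortId<T> extends Id<T> {
    private final long high;
    private final long low;

    ShortId(long high, long low) {
        this.high = high;
        this.low = low;
    }

    @Override
    byte[] toBytes() {
        return ByteBuffer.allocate(16).putLong(high).putLong(low).array();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ShortId)) return false;
        ShortId<?> other = (ShortId<?>) o;
        return high == other.high && low == other.low;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(high) + Long.hashCode(low);
    }
}

// Variable-length id, e.g. a content-based checksum.
final class ArrayId<T> extends Id<T> {
    private final byte[] bytes;

    ArrayId(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy keeps the id immutable
    }

    @Override
    byte[] toBytes() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ArrayId && Arrays.equals(bytes, ((ArrayId<?>) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}

// Hypothetical convenience methods; names are assumptions.
final class IdFactory {
    private static final SecureRandom RANDOM = new SecureRandom();

    private IdFactory() {}

    static <T> ShortId<T> randomShortId() {
        return new ShortId<>(RANDOM.nextLong(), RANDOM.nextLong());
    }

    static <T> ArrayId<T> randomArrayId(int length) {
        byte[] bytes = new byte[length];
        RANDOM.nextBytes(bytes);
        return new ArrayId<>(bytes);
    }

    // Id to String: two lowercase hex digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // String to id: the inverse of toHex.
    static <T> ArrayId<T> parse(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new ArrayId<>(bytes);
    }
}
```

In this sketch a ShortId instance carries only its two long fields and
an ArrayId only its array reference. Since equals ignores T, two ids
with the same bytes compare equal at runtime whatever their type
parameter; keeping an Id<A> apart from an Id<B> is left entirely to the
compiler.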

In any case, we have type safety, as an Id<A> is never an Id<B> (if A
and B are unrelated), which makes taking the T type into account
useless at the equals level, if I'm not mistaken. We have memory
efficiency and we don't need to multiply the number of concrete Id
types by the number of concrete indexable types. Seems to be quite
nice.

Cheers,

Fabrice