beeseek-devs team mailing list archive
Message #00063
Re: DEV meeting minutes 2008-02-04
I'd like to add that, looking ahead, we could also evaluate the
possibility of "peer-crawling", i.e. using peers as crawlers too.
Simone
2008/2/4 Emanuele Rampichini <emanuele_rampichini@xxxxxxxx>:
> 1 - Overview Of Done Work.
> ADDED BY Emanuele "lele85" Rampichini
> SPEAKER Andrea "andrea-bs" Corbellini
>
> After a short discussion of the work done by Andrea Corbellini, we decided to invest time and effort in basic documentation. The focus is on producing detailed documentation in a short time, so that any developer will be able to create a module for hive (or other code) without reading the internals.
>
> Assigned to this task:
> Andrea "andrea-bs" Corbellini
> Gianluigi "Wing_Zero" Biancucci
>
> 2 - Architecture Of P2P System
> ADDED BY Andrea "andrea-bs" Corbellini
> SPEAKERS Developer Team
>
> We can summarize the discussion as 3 possible choices:
>
> 1) CLUSTER BASED NETWORK
> + Very High Reliability from the very beginning (linear dependence between reliability and superpeer number)
> + Speed (Routing in a well known network is easy and fast)
>
> - Expensive (We need to pay for 24/7 online superpeers with high bandwidth and disk space)
> - Impossible to use cluster subdivision of data. (Internet queries are too heterogeneous)
>
> 2) PURE P2P NETWORK
> + Easy implementation (In a pure P2P architecture every node is equal)
> + It's good for a distributed, heterogeneous P2P database
> + Low costs (scalable network with no costs)
>
> - Low Reliability (Good reliability only with a very large number of peers)
> - Low Speed (Routing in a pure P2P network can be very slow)
>
> 3) HYBRID NETWORK
> + Good Reliability from the beginning
> + Much faster than a pure P2P network (Faster routing based on a well known superpeer list)
> + Once we reach a "critical mass" we can shut down superpeers without killing the entire net.
> + Medium cost (we need reliable superpeers initially, but we get a highly scalable network)
>
> - Very difficult to design (lots of things to think about)
> - Very difficult to implement
> - Medium cost :D (It's not that big, but it is still a cost!)
>
> In a simple poll, the HYBRID NETWORK came out as the best choice for the beeseek project.
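
To make the hybrid idea concrete, here is a minimal sketch of how a lookup could work, assuming each node keeps a small local keyword index. The names Node, hybrid_lookup and max_peer_hops are hypothetical and were not discussed in the meeting; this only illustrates "ask the well-known superpeers first, then fall back to ordinary peers".

```python
class Node:
    """A network participant with a small local keyword index (hypothetical)."""

    def __init__(self, name, index=None, online=True):
        self.name = name
        self.index = dict(index or {})
        self.online = online

    def lookup(self, keyword):
        # An offline node behaves like an unreachable host.
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.index.get(keyword)


def hybrid_lookup(keyword, superpeers, peers, max_peer_hops=3):
    """Return (result, answering_node_name), or (None, None) if nothing is found."""
    # Step 1: fast path through the well-known superpeer list.
    for sp in superpeers:
        try:
            result = sp.lookup(keyword)
        except ConnectionError:
            continue  # superpeer is down, try the next one
        if result is not None:
            return result, sp.name

    # Step 2: fallback to ordinary peers, bounded by max_peer_hops
    # reachable peers so that a miss does not flood the network.
    asked = 0
    for peer in peers:
        if asked >= max_peer_hops:
            break
        try:
            result = peer.lookup(keyword)
        except ConnectionError:
            continue  # unreachable peers do not count as a hop
        asked += 1
        if result is not None:
            return result, peer.name

    return None, None
```

With this shape, a superpeer going offline only removes the fast path: the query still succeeds if an ordinary peer holds the data, which is the "critical mass" property listed above.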
>
> 3 - DBMS Approach vs Plain Text or XML.
>
> By broad agreement we have chosen XML for data representation. A DBMS (like MySQL, PostgreSQL, etc.) has too many high-level functions that we will never use. On the other hand, plain text provides no functions at all. We chose XML because it is standard, easily extensible, and there are query languages for it, such as XQuery.
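
As an illustration of the XML choice, here is a small sketch of storing page records as XML and querying them. The element and attribute names (index, page, url, title, keyword) are assumptions, not an agreed schema. It uses Python's standard xml.etree.ElementTree, which only supports a limited XPath subset and not XQuery, so take it as a stand-in for a real query engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: one <page> per crawled URL.
DOC = """\
<index>
  <page url="http://example.org/bees">
    <title>About bees</title>
    <keyword>bees</keyword>
    <keyword>honey</keyword>
  </page>
  <page url="http://example.org/wasps">
    <title>About wasps</title>
    <keyword>wasps</keyword>
  </page>
</index>
"""


def urls_for_keyword(xml_text, keyword):
    """Return the URLs of all pages tagged with the given keyword."""
    root = ET.fromstring(xml_text)
    # The [keyword='...'] predicate matches pages that have at least one
    # <keyword> child with exactly this text. The keyword is assumed not
    # to contain quote characters.
    pages = root.findall(f"./page[keyword='{keyword}']")
    return [page.get("url") for page in pages]
```

For example, urls_for_keyword(DOC, "honey") selects the bees page only, and a keyword that matches nothing gives an empty list.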
>
> 4 - Google Problem.
>
> The first idea was to collect links and data using Google. That's not possible, so we had to find alternatives:
>
> + Alexa
> + crawler
> + manual input by humans :D
> + Grub (open distributed crawler)
>
> Assigned to this task:
> Andrea "warp10" Colangelo
>
> The entire log of the discussion is attached.
--
Ing. Simone Brunozzi
Via del Volontariato, 22 - 06083 Bastia Umbra (PG) - ITALY
Cell. +39 392-1551977 / +39 340-5768488
---------------------------------------
www.ubuntista.it | www.nonovvio.it
http://www.linkedin.com/in/simonebrunozzi
---------------------------------------