
zeitgeist team mailing list archive

Re: [Bug 494288] Re: "apriori": get most used (websites/notes/documents/etc...)

 

2009/12/9 Mikkel Kamstrup Erlandsen <mikkel.kamstrup@xxxxxxxxx>

> I don't think we should consider open/close events when calculating
> these relations. That way it won't work for contacts and other
> non-file-like items.
>
> The initial step of the algorithm: "Fetch the last 7 events for this
> subject uri" seems good.
>
> The next step where you create a time range neighbourhood around each of
> these events, is a bit unclear to me... You create the neighbourhood as
> (event.timestamp, <next_event_timestamp>). This seems odd at a glance.
> Why not (event.timestamp - delta, event.timestamp + delta) ?
>

Please explain your delta.

The reason I went with this way of generating the neighbourhoods is this.
Imagine the following event sequence, where *x* is an event for the
subject we are interested in:

*x*, 1, *x*, 3, 8, 1, 2, *x*, 5, 4, *x*, 2, 6, 7

If I took the next 7 events after each *x*, I would get:

[*x*, 1, *x*, 3, 8, 1, 2, *x*]
[*x*, 3, 8, 1, 2, *x*, 5, 4]
[*x*, 5, 4, *x*, 2, 6, 7]
[*x*, 2, 6, 7]

This has a lot of overlap.

However, by taking the ranges between two consecutive *x* events
instead, I get:

[x,1] [x,3,8,1,2] [x,5,4] [x,2,6,7]

Does that make sense?
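
To make the comparison concrete, here is a minimal Python sketch of both
ways of building the neighbourhoods. The function names and the plain
list of events are assumptions made for illustration; this is not the
code from the branch.

```python
def fixed_windows(events, target, size=7):
    """Each occurrence of `target` plus the next `size` events."""
    return [events[i:i + size + 1]
            for i, event in enumerate(events) if event == target]


def between_occurrences(events, target):
    """Start a new group at every occurrence of `target`."""
    groups = []
    for event in events:
        if event == target:
            groups.append([event])
        elif groups:
            # Events before the first `target` belong to no group.
            groups[-1].append(event)
    return groups


events = ["x", 1, "x", 3, 8, 1, 2, "x", 5, 4, "x", 2, 6, 7]
overlapping = fixed_windows(events, "x")
disjoint = between_occurrences(events, "x")
```

With the sequence above, `overlapping` reproduces the four overlapping
lists and `disjoint` reproduces the four non-overlapping groups, so each
event is counted in at most one neighbourhood.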


>
> Next thing is that I think you can do the two last steps of the
> algorithm in one SQL query. Ie. the parts where you create the k_tuples
> and the part where you calculate the support of the k_tuples. Possibly:
>
> SELECT subj_uri, count(subj_uri)
> FROM event_view
> WHERE (timestamp > ? AND timestamp < ?) OR (timestamp > ? AND timestamp < ?) OR
> (...) ...
> GROUP BY subj_uri
> ORDER BY timestamp ASC
> LIMIT 5
>
> I am sure Siegfried can do this even better though :-D
>
> --
> "apriori": get most used (websites/notes/documents/etc...)
> https://bugs.launchpad.net/bugs/494288
> You received this bug notification because you are a direct subscriber
> of the bug.
>
> Status in Zeitgeist Framework: New
>
> Bug description:
> We have a branch with the 1-step apriori algorithm built.
> Right now it returns the items most used together with another item.
> We should make it configurable, so that we can ask for the most used
> interpretations of items together with other items.
> This way we can, for example, ask for the most used "websites" with
> document X, etc.
> What do you think?
>
> To unsubscribe from this bug, go to:
> https://bugs.launchpad.net/zeitgeist/+bug/494288/+subscribe
>
>
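
The single-query idea quoted above could be tried out along these lines.
This is only a sketch using Python's sqlite3 module: the two-column
event_view table, the sample rows and the function name are assumptions
made up for the example, and it orders by the count (descending) rather
than by timestamp, on the assumption that the most used subjects are what
we want back.

```python
import sqlite3


def most_used_in_ranges(conn, ranges, limit=5):
    """Count events per subject URI inside any of the (start, end) ranges."""
    if not ranges:
        return []
    # One "(timestamp > ? AND timestamp < ?)" clause per neighbourhood.
    where = " OR ".join("(timestamp > ? AND timestamp < ?)" for _ in ranges)
    params = [bound for time_range in ranges for bound in time_range]
    sql = (
        "SELECT subj_uri, COUNT(subj_uri) AS support "
        "FROM event_view "
        f"WHERE {where} "
        "GROUP BY subj_uri "
        "ORDER BY support DESC, subj_uri ASC "
        "LIMIT ?"
    )
    return conn.execute(sql, params + [limit]).fetchall()


# Hypothetical stand-in for the real event_view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_view (subj_uri TEXT, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO event_view VALUES (?, ?)",
    [("file:///a", 11), ("file:///b", 12), ("file:///a", 21), ("file:///c", 50)],
)
rows = most_used_in_ranges(conn, [(10, 15), (20, 25)])
```

Because the range bounds are strict, an event sitting exactly on a
boundary timestamp is not counted.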

-- 
"apriori": get most used (websites/notes/documents/etc...)
https://bugs.launchpad.net/bugs/494288
You received this bug notification because you are a member of Zeitgeist
Framework, which is the registrant for Zeitgeist Framework.

Status in Zeitgeist Framework: New

Bug description:
We have a branch with the 1-step apriori algorithm built.
Right now it returns the items most used together with another item.
We should make it configurable, so that we can ask for the most used interpretations of items together with other items.
This way we can, for example, ask for the most used "websites" with document X, etc.
What do you think?




