
winterize-users team mailing list archive

Re: Architecture is moving forward

 

On 2/7/2014 2:02 PM, Jens Koeplinger wrote:

The A(n) = B(m) + 1 primitive is a great example of a building block for problems that could be computed in Winter more efficiently than otherwise. Thinking about a specific implementation for A, B, and their respective generations m, n, upon which some kind of result is computed (the +1), this could be a pool of actors { A, B, C, ... } executing pairwise computations, where the question is the effective average result. Say, e.g., your actors { A, B, C, ... } are different computers with the same hardware configuration and OS, and they interact with one another by each running a different chess program or algorithm. A Winter program could instruct the actors to do so and calculate effective win/loss ratings. Winning against another actor who has won more often in the past gives a higher score than winning against an actor who has only lost so far. All calculations are done pairwise, at any time, and there is no requirement for a common or synchronized state of the overall system. Of course you can request a readout of scores from all actors at any given time, and you'll get a more precise answer the more generations have passed. In this very specific example, the result would be a fairly representative strength score for the various programs. Winter would help code the entire process.


Of course this is an extreme example with highly structured actors; I'm sure there are better ones where the A, B, ... are primitives but something meaningful can still be calculated.


The pairwise idea is a great notion: high-performance calculations, dedicated processing nodes. Nice.
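Just to make the pairwise scoring concrete for myself, here's a rough sketch in Python, assuming an Elo-style update rule (the K factor, the starting ratings, and the formula are my own assumptions, not anything Winter defines). Each game only touches the two actors involved, so there is nothing global to synchronize:

import random

K = 16  # assumed step size for rating updates

def expected(score_a, score_b):
    # Elo-style expected probability that A beats B.
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))

def play(ratings, a, b, result_a):
    # result_a: 1.0 win, 0.5 draw, 0.0 loss for actor a.
    # Beating an actor who has mostly won so far moves the score more
    # than beating one who has mostly lost, because the expectation is lower.
    ea = expected(ratings[a], ratings[b])
    eb = 1.0 - ea
    ratings[a] += K * (result_a - ea)
    ratings[b] += K * ((1.0 - result_a) - eb)

actors = ["A", "B", "C"]
ratings = {name: 1000.0 for name in actors}
for _ in range(1000):
    a, b = random.sample(actors, 2)                      # any pair, at any time
    play(ratings, a, b, random.choice([1.0, 0.5, 0.0]))  # stand-in for a real chess game
print(ratings)  # readout at any time; more games give a more representative answer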

Not sure if I'm moving away from that, but one problem is that letting anyone make connections between lots of pages creates more and more dependencies. Sprawl. How do you update lots of related pages? So I thought about how a database does it.

In a database, multiple changes that need to be mutually consistent are performed together in a transaction, so that either all of the changes happen or none of them do. Later on, someone can run a query and obtain several values in a table join, and those values are mutually consistent. One value will never be "ahead" of the others while multiple changes are being made. It's all or nothing.
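For reference, this is the all-or-nothing behavior I mean, sketched with SQLite from Python (the table names are made up for the example):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE ledger (order_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10.0);
    INSERT INTO ledger VALUES (1, 10.0);
""")

# Both updates commit together or not at all; a reader joining the two
# tables never sees the order changed while the ledger still holds the
# old amount.
with conn:  # opens a transaction; commits on success, rolls back on error
    conn.execute("UPDATE orders SET total = 25.0 WHERE id = 1")
    conn.execute("UPDATE ledger SET amount = 25.0 WHERE order_id = 1")

row = conn.execute(
    "SELECT o.total, l.amount FROM orders o JOIN ledger l ON l.order_id = o.id"
).fetchone()
print(row)  # (25.0, 25.0) -- mutually consistent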

The wiki can't work like that when one page is asserted to be a function applied to another page. Someone can save a new value for a page, and the things that depend on it might not be updated yet. Inconsistent.

But I had envisioned that, to get as close to that as possible, we let the wiki run leisurely and update when it can, some here, some there, with intermediate results that are not consistent with respect to each other. The key then is the concept of a generation: one pass of recalculations over whatever pages are out of date.

Almost like:
1. begin database transaction
2. recalculate all pages that are out of date
3. commit the transaction

Nobody can see the new data until we commit the transaction.
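Here is a rough in-memory sketch of that shape in Python (the store layout and names are mine, just to illustrate): derived pages are recalculated into generation N+1 off to the side, and the single assignment of the current generation number is the "commit".

# Hypothetical store: values[page][generation] -> value of that page.
values = {"a": {0: 1}, "b": {0: 2}, "sum": {0: 3}}
formulas = {"sum": lambda gen: values["a"][gen] + values["b"][gen]}  # "sum" is a function of other pages
current_generation = 0
pending_edits = {"a": 10}  # someone saved a new value for page "a"

def recalculate_generation():
    global current_generation
    old, new = current_generation, current_generation + 1
    # 1. "begin": start the new generation from the last consistent snapshot
    for per_generation in values.values():
        per_generation[new] = per_generation[old]
    for page, value in pending_edits.items():
        values[page][new] = value
    pending_edits.clear()
    # 2. recalculate every derived page that is out of date
    for page, formula in formulas.items():
        values[page][new] = formula(new)
    # 3. "commit": readers only ever look up current_generation, so nothing
    # is visible until this single assignment
    current_generation = new

recalculate_generation()
print(current_generation, values["sum"][current_generation])  # 1 12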

It's not perfect since, as I pointed out, you can't get TWO resources in ONE GET in the HTTP protocol, the equivalent of an SQL query that does a join. You always risk getting a new value of one and the old value of the other. So to reduce that risk, I had come up with the plan to let you get a specific generation of one thing, then get that same generation of the other thing afterward (two HTTP GETs, for example). The values are consistent with each other whether or not a generation boundary was crossed between the two requests.
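On the client side I imagine something like this, assuming the generation can be named in the request (the /current-generation and /page/<name>?generation=N URLs are only placeholders, not a real endpoint):

import json
import urllib.request

BASE = "http://wiki.example.org"  # placeholder host

def get_json(path):
    with urllib.request.urlopen(BASE + path) as response:
        return json.load(response)

# First ask which generation is current, then pin both reads to it.
generation = get_json("/current-generation")["generation"]
page_a = get_json(f"/page/A?generation={generation}")
page_b = get_json(f"/page/B?generation={generation}")
# Even if a new generation committed between the two GETs, both values
# come from the same generation, so they are consistent with each other.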

As a nice side effect, *every* page recalculation can be done in parallel! That allows processing power to scale as you add new machines to your processing grid. The timing of new saves of data won't matter, since nobody sees anything until you hit the generation boundary.
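That part could look something like this, assuming each page's recalculation within a generation is independent (concurrent.futures here only shows the shape on one machine; a real grid would fan the same work out to many):

from concurrent.futures import ProcessPoolExecutor

def recalc_page(page_name, generation):
    # Placeholder for recomputing one out-of-date page at a given generation.
    return page_name, f"value of {page_name} at generation {generation}"

def recalc_all(out_of_date_pages, generation):
    # Every recalculation in the batch runs at the same time; nothing is
    # published until the whole batch is done and the generation boundary
    # is crossed.
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(recalc_page, p, generation) for p in out_of_date_pages]
        return dict(f.result() for f in futures)

if __name__ == "__main__":
    print(recalc_all(["A", "B", "C"], 42))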
