Thanks for the write-up! I really like the abstraction you’ve put forward. It (almost) all makes sense to me. There are a few subtle points I’d like to bring into focus.
Above, a transition x \to y is defined by two partitions, one of x\backslash y and another of y\backslash x. That is, two disjoint sequences of sets \{U_i\}_{i\leq k}, \{W_i\}_{i\leq k}.
What if instead we use covers (i.e. not necessarily disjoint sequences) of x\backslash y and y\backslash x? I think they would also do the job: we could still pack independent state transitions into a “parallel” transition.
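To make the distinction concrete (my notation, matching the definitions above), the only difference is dropping pairwise disjointness:

```latex
% Partition of x \backslash y: the blocks exhaust the set and are pairwise disjoint.
\textstyle\bigcup_{i \leq k} U_i = x \backslash y,
\qquad U_i \cap U_j = \emptyset \ \text{for } i \neq j.

% Cover of x \backslash y: same union condition, but the U_i may overlap.
\textstyle\bigcup_{i \leq k} U_i = x \backslash y.
```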
The reason for requiring partitions and not covers might be what @cwgoes said:
In my view, this design might be too limiting with respect to two goals:
- Parallelization
- Tasks with preprocessing
Parallelization. The goal is to parallelize an algorithm across different nodes or threads (different logic scopes), and then pack their results into a single transaction.
There are tasks that cannot be distributed unless we allow persistables to participate in different logic scopes. In general, the design won’t work for any problem in NC whose input cannot be partitioned.
For example, computing a matrix multiplication \mathbf C = \mathbf A\mathbf B can be parallelized by computing the inner product of every (row, column) pair. Each row \mathbf a_i of \mathbf A participates in as many inner products as there are columns \mathbf b_j of \mathbf B. If we view rows and columns as persistables of kind ‘vector’, there is no way to split n\times n matrix multiplication into disjoint logic scopes, each in charge of a subset of the inner products. The only partition available is \langle U, W\rangle, where the consumed persistables are U = \{\mathbf a_i, \mathbf b_i\}_{i\leq n} and the created persistables are W = \{\mathbf c_i\}_{i\leq n}. In other words, we are forced to deal with persistables of kind ‘matrix’ as a whole.
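A minimal sketch of the obstruction, in plain Python with a thread pool standing in for logic scopes (all names here are mine, purely for illustration): every task computing \mathbf c_{ij} reads both \mathbf a_i and \mathbf b_j, so any two tasks sharing a row or a column have overlapping read sets.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy "state": rows of A and columns of B, each standing in for a
# persistable of kind 'vector'.
A = [[1, 2], [3, 4]]          # rows a_1, a_2
B_cols = [[5, 7], [6, 8]]     # columns b_1, b_2 of B

def inner_product(i, j):
    # This "scope" reads a_i AND b_j; another scope computing c_{i,j'}
    # reads the same a_i, so the read sets overlap: no disjoint
    # partition of the consumed persistables exists at vector granularity.
    return i, j, sum(a * b for a, b in zip(A[i], B_cols[j]))

with ThreadPoolExecutor() as pool:
    tasks = [pool.submit(inner_product, i, j)
             for i in range(len(A)) for j in range(len(B_cols))]
    C = [[0] * len(B_cols) for _ in range(len(A))]
    for t in tasks:
        i, j, v = t.result()
        C[i][j] = v

print(C)  # [[19, 22], [43, 50]]
```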
This brings me to the second point I’d like to draw attention to.
Tasks with preprocessing. What if we want to do reactive computations, namely algorithms that re-use (part of) the state without modifying it? Tasks that need some sort of pre-processing fall into this category. To capture this type of task, I think the domain of the logic P_v should be X \cup Y, so that we also account for X \cap Y, the unchanged portion of the state for this transition.
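In symbols (my formulation of the suggestion, not a quote from the design above):

```latex
% Proposed domain of the logic P_v for a transition X \to Y:
\mathrm{dom}(P_v)
  = \underbrace{X \backslash Y}_{\text{consumed}}
    \;\cup\; \underbrace{Y \backslash X}_{\text{created}}
    \;\cup\; \underbrace{X \cap Y}_{\text{read-only}}
  \;=\; X \cup Y.
```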
For example, continuing with linear algebra: given a set of vectors \mathcal B = \{\mathbf v_1, \ldots, \mathbf v_n\}, we’d like to check whether other vectors are linear combinations of them, but only if the elements of \mathcal B are independent. The algorithm P then first proves the independence of \mathcal B, and only afterwards checks membership in its span.
We could first prove that the n vectors in \mathcal B are independent and add them to the state as persistables of kind ‘linearly independent vectors’ in transition X \to Y. Later on, we add a vector \mathbf v to the state as a persistable of kind ‘linearly dependent’ in transition Y \to Z. With the current design we cannot re-use \mathcal B: it must be re-created, or copied, every time (and its linear independence proved for each copy).
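A hedged sketch of what re-use would look like, in Python with numpy (the function names are hypothetical): the independence proof runs once as preprocessing, and later transitions only read \mathcal B.

```python
import numpy as np

# Persistable of kind 'linearly independent vectors': the basis is
# checked once (transition X -> Y) and should be reusable read-only.
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])  # rows are v_1, v_2

# Preprocessing: prove independence once, not per copy.
assert np.linalg.matrix_rank(B) == B.shape[0], "vectors are dependent"

def is_linear_combination(v, basis, tol=1e-9):
    # Reactive step (transition Y -> Z): read the basis without
    # modifying it and test whether v lies in its span, via least squares.
    coeffs, *_ = np.linalg.lstsq(basis.T, v, rcond=None)
    return np.allclose(basis.T @ coeffs, v, atol=tol)

print(is_linear_combination(np.array([2.0, 3.0, 5.0]), B))  # True: 2*v_1 + 3*v_2
print(is_linear_combination(np.array([0.0, 0.0, 1.0]), B))  # False: outside the span
```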
In a related post I discussed these two notions (with different terminology). There, the conclusion (as I understood it, @vveiln) was that we can always capture parallelization and preprocessing by passing resources as application data. Fair enough. However, it feels to me that application data is a sort of ‘miscellaneous’ category that comes to patch whatever the abstraction misses. I really do like your abstraction, and I feel it should be able to capture more.