AVM → transaction function compilation

The aim of this post is to explicate a bit of my current expectation for how AVM → transaction function compilation should work (in the context of overall AVM → RM compilation). I don’t have an exact model worked out, just a high-level pattern which I think makes sense (open to counterarguments).

Let’s take as our basic frame the compilation of AVM programs to writes (as in, when we want to make state changes somewhere in the database). State changes are ultimately represented as transactions, which transaction functions can be used to generate. The read side of the question (a.k.a. indexing) is also important, and yet unaddressed, but is not the topic of this post.

So, somehow, we should be able to take an AVM program (perhaps in the context of some application definitions) and compile it to transaction function(s) and resource logic(s). How exactly? Let’s think about a simple program – starting with a message sent to O1, which then calls O2, which in turn calls O3, and then the return path back:

Assuming that “O2” and “O3” are, in some fashion, object references, this program involves lookup, when O1 looks up which version of O2 (represented as a resource) to call (consume), and O2 looks up which version of O3 (represented as a resource) to call (consume). We then face the question of where, in our distributed system, this lookup should happen. This is a choice which is independent of the high-level application semantics. Suppose that we compile this program to a transaction function which performs the lookups (and then returns a transaction with the appropriate state changes). This transaction function could be run locally (performing the lookups through scry calls with reference to the latest known state of the controller of the objects in question), or it could be run in post-ordering execution on the controller itself. Running the transaction function on the controller post-ordering guarantees that the lookup results will be current (i.e. no conflicts), but may be more expensive, and doesn’t guarantee the exact results of program execution, which the user may wish to do. Running the transaction function locally is likely cheaper (since the controller need only verify the transaction) and guarantees the exact results (since the transaction will revert and no state changes take place if the objects have changed in the meantime), but also means that there could be a conflict if the objects in question are changed between the time when the user runs the transaction function locally and the time when the controller validates the transaction.

We actually have a more granular choice: when we “start running” a program somewhere in the distributed system, for every lookup we encounter, we can decide whether to perform that lookup locally or encode it in a transaction function to be sent onwards. In a diagram, this might look something like:

In this example, partial execution takes place in three places: on node A, which finalizes the changes to object O2, node B, which finalizes the changes to object O3, and the controller, which finalizes the changes to objects O4 and O1 (and executes the transaction enacting the actual state changes, assuming that there are no conflicts). At each step, the results of execution so far need to be encoded (in a Transaction) and the execution to be continued needs to be passed on (in a TransactionFunction). We would probably benefit from a somewhat more sophisticated type here (this topic has been discussed before).

It should be possible to extend this model naturally to solving, where execution simply cannot take place on a node which doesn’t know of a matching intent (and where the results of execution are indeterminate, modulo the specified constraints).

Thoughts? Paging @graphomath @Lukasz @l4e21 @mariari

2 Likes

AVM programs are inspired by the programs from this paper. In particular, each program from the paper should ideally have an AVM counterpart. In the paper, there is nothing else besides objects, invocations of methods on objects whose bodies are programs, and values returned.[1] For the sake of the present post, I think, we can safely assume that the first batch of AVM applications to have an object semantics.[2]

The central idea of the paper is to design objects such that programs and method bodies are the very same kind of entity. As an example, consider the below body of the method inc on a counter object c, which is a program (that can be understood as a communication protocol between the counter c and the CAS register r).

The inc method (body) is using a CAS register r to implement the increment. So, when the counter c receives the message inc, it send a read to its counter, and then tries to increment the contents of the CAS register and depending on whether this has worked or not, either the updated value is returned or another attempt to increment is made. On a sunny day, the program executes as follows (and we’ll defer the rainy day to later in another reply).


LaTeX

\begin{msc}{\texttt{inc} execution}
\declinst{c}{counter}{c}
\declinst{r}{
\begin{tabular}[b]{c}
CAS\tabularnewline
register
\end{tabular}
}{t}
\mess{\texttt{inc}}{envleft}{c}
\mess{\texttt{read}}{c}{r}
\nextlevel[2]
\mess*{2}{r}{c}
\nextlevel[2]
\mess{\texttt{cas}(2,3)}{c}{r}
\nextlevel[2]
\mess*{\textit{ok}}{r}{c}
\mess*{3}{c}{envleft}
\end{msc}


  1. Programs can also be the arguments of methods. “The arguments to methods typically consist of first-order values (integers, strings), but they may also be programs (described below); indeed, this possibility will be key to our treatment of transactions in § 8” op. cit. ↩︎

  2. This is in line with the object as message transformer perspective, which is independent of any choice of programming language syntax. ↩︎

2 Likes

First, as a clarification, the program execution you have in mind is the following execution of the message Entry sent to object o1.

Latex
  \begin{msc}{AVM program execution}
    \declinst{user}{user}{\faUser}
    \declinst{o1}{\colorbox{SpringGreen}{o1}}{}
    \declinst{o2}{\colorbox{Dandelion}{o2}}{}
    \declinst{o3}{\colorbox{Aquamarine}{o3}}{}

    \mess{entry}{user}{o1}
    \mess{}{o1}{o2}
    \mess{}{o2}{o3}
    \nextlevel[2]
    \mess*{}{o3}{o2}    
    \mess*{}{o2}{o1}
    % \mess[color=lightgray,dotted]{?}{o1}{user}
  \end{msc}

I think, the topic of locality is secondary. Isn’t it rather a question of who/what is looking up whom/what, which causal dependencies exist between the look-ups and which groups of lookups are performed atomically?

1 Like

I’ll just mention briefly how this relates to the current state of GOOSE. In GOOSE, AVM programs are translated to what we call “Anoma programs”, which essentially describe transaction functions: they fetch some objects and submit a single transaction. Where the objects are fetched from or where the transaction functions / Anoma programs run is abstracted away - we assume that it just can be done somehow.

1 Like

Thanks for pointing out the need to clarify that there are different kinds of programs around. If possible, I’d like to aim for the following picture where an edge A \to B as A should have a counterpart in B and ideally the counterpart is obtained via compilation.

AVM programs → goose-AVM programs → goose-Anoma programs

However, the AVM programs still need a way to take care of object creation and are missing a way to bring the users into the picture (in particular of who is receiving outputs of method invocations “external” to the AVM, so to speak).

1 Like

Yes, exactly.

Yes, I agree, it would have been more precise of me to say where in system history the look-ups happen (which is related to which node actually performs the look-ups insofar as the node must know the relevant part of the history).

That makes sense – but do we have the ability to choose “how much” of the program we evaluate (in terms of which lookups are performed vs. which are encoded into a transaction function to be sent onwards)? [not necessarily saying that we need to add this immediately, just to clarify the model]

This I don’t quite follow. What are goose-AVM programs as distinct from AVM programs, and what are goose-Anoma programs as distinct from transaction functions and/or resource logics?

1 Like

That makes sense – but do we have the ability to choose “how much” of the program we evaluate (in terms of which lookups are performed vs. which are encoded into a transaction function to be sent onwards)? [not necessarily saying that we need to add this immediately, just to clarify the model]

No, this is currently abstracted away. There are just object fetches and GOOSE doesn’t specify how they’re implemented.

2 Likes