This post is a WIP, posting to save – please ignore for now.
What might it look like to design a local AVM?
This post aims to explore this question. Note that a local AVM is the same thing as a transparent AVM, for by “local” we can infer both (a) an ordering domain and (b) a trust domain, so there is no need to regulate information disclosure (except between applications running in the AVM – but not in terms of the operations of the AVM itself).
The reader of this post may find it more comprehensible (I hope) with the prior context of @graphomath’s current draft AVM paper. I will attempt to define concepts in a standalone way, but for the sake of brevity I will elide the mathematical precision which the paper renders more explicit and may be helpful.
What is the AVM?
Abstractly, the AVM is a state machine, i.e. a function of the form
where:
- S is the type of AVM state (a collection of objects).
- I is the type of AVM programs.
- O is the type of AVM outputs.
An instance of the AVM is simply the pair of a specific state s (an inhabitant of S) and this transition function. Note that an instance of the AVM is thus itself an object. This will come in handy later.
What is the AVM state?
Roughly, the AVM state consists of a collection of objects, each associated with a unique identifier. For the sake of simplicity in this exposition, we can assume that identifiers are just natural numbers, although this is not necessary in practice. An object is a pair of an input history and a characteristic function which defines the future behavior of that object (see 4.1 of the ART report for a precise definition).
What are AVM programs?
An AVM program p
is either:
x <- INPUT; p'
, which:
a. Reads the input (the message sent to this object)
b. Stores the result in the variablex
c. Continues execution of programp'
in the updated context (now includingx
).x <- <VALUE>; p'
, which:
a. Stores the value<VALUE>
in the variablex
.VALUE
may be either a literal or another variable (so this instruction can be used to copy variables).
b. Continues execution of programp'
in the updated context (now includingx
)x <- FUNCTION_CALL <FUNC> <DATA>; p'
a. Calls functionFUNC
on valueVALUE
.FUNC
must be a computable function. Functions are represented by a multiformat.VALUE
may be a literal or a variable.
b. Stores the result in the variablex
.
c. Continues execution of programp'
in the updated context (now includingx
)x <- CREATE <FUNC>; p'
, which:
a. Creates a new object with characteristic functionFUNC
.FUNC
must be a valid AVM program.
b. Stores the new object’s identifier in the variablex
.
c. Continues execution of programp'
in the updated context (now includingx
).DESTROY <ID>; p'
, which:
a. Destroys the object with identifierID
.ID
may be a literal or a variable.
b. Continues execution of programp'
.x <- CALL <ID> <MSG>; p'
, which:
a. Sends the messageMSG
to the object associated with identifierID
.ID
andMSG
may be literals or variables.
b. Stores the result in the variablex
.
c. Continues execution of programp'
in the updated context (now includingx
).TODO: This is where we might want to allow “generalized method calls”, i.e. methods can also be programs which execute multiple calls (but make no state changes). This could probably also be done w/EVAL though.
SWITCH x PROGS
, a generalized case analysis, which:
a. Selects a specific program inPROGS
depending on the value of the variablex
.PROGS
is just a function (map) from values to programs.
b. Executes that program in the current context.TODO: I think this instruction can be implemented w/FUNCTION_CALL and EVAL.
x <- SCRY <FILTER>; p'
, “scry” or object lookup, which:
a. Searches for objects matchingFILTER
b. Returns the results in the variablex
(a set of identifiers)
c. Continues execution of programp'
in the updated context (now includingx
)x <- SELF; p'
, which:
a. Stores the identifier of this object in the variablex
b. Continues execution of programp'
in the updated context (now includingx
)EVAL x; p'
, which:
a. Attempts to interpret the value ofx
as an AVM program
b. Executes that program in the current context
c. Continues execution of programp'
in the updated contextTODO: Somehow we need to pass variables to the program.
REVERT
, which:
a. Reverts all execution. Note that this revert cannot be caught.RETURN x
, which:
a. Discards the current context
b. Returns the value associated with the variablex
Note: this program form could be represented as a S-expression.
Contexts are just maps from symbols to values. Values are untyped data. A given variable symbol may be written to only once, but read from any number of times.
TODO: Do we want more “first-class” AVM evaluation, e.g. including the state?
What are AVM outputs?
The AVM can be executed in two modes:
- In “normal mode”, the output is just whatever is returned by the top-level program.
- In “trace mode”, the output is whatever is returned by the top-level program, plus a full trace of all calls and responses which took place during program execution.
What is the difference between computation and messaging?
The difference between computation and messaging is simply whether or not there is some need to track any state beyond the message response. Computation could also be modeled as creating an object, sending a message to that object, getting a response, and destroying the object (thereby allowing the object history/state to be garbage collected as it can no longer be referred to), but this seems less elegant.
Functions perform computations. They can be called any number of times and have no state. Functions must be pure functions in the mathematical sense, i.e. (potentially partial) maps from the domain to the codomain. Failure is equivalent to REVERT
.
Process objects process messages. They can be created (once), called (any number of times), and destroyed (once). Process objects cannot themselves store any data – they can only create, destroy, look up, and call other objects, and perform computations.
Data objects contain data. They can only be created (once), read (any number of times), and destroyed (once). Data objects cannot themselves execute any computations, perform any calls, etc. – they are “just data”.
TODO: This needs to be reflected in the instruction set.
Ownership
Objects each have an owner. The owner of an object is either:
- the object which created this object, if this object was created by another object, or
- the user, if this object was created by a top-level AVM program
In the implementation, the user may be represented by a sentinel “zero-identifier”.
Only the owner of an object can call the object or scry for it. The owner of an object cannot be changed. A similar effect can be achieved by destroying the previous object and creating a new one (through a different intermediary object, which will become the owner of the new object).
This rather strict limitation ensures that call structures always take the form of an acyclic DAG, i.e. child can never call parent.
Note: this ownership structure seems somewhat similar to Unix processes, for which I wasn’t able to find a good formalism in a few minutes of searching, but this might be worth exploring more.
Properties
- Hierarchical equivalence
- AVM state can be segmented by object
- Equivalent to running a little AVM instance w/just that object’s state
- Other objects cannot be called by non-owner
- Thus SCRY is equivalent to visibility over the object’s own history
- Equivalent to AVM ART definition of “object”
- Behavioral transparency
- Objects can be replaced by behaviorally equivalent objects
- Transactional semantics
- Computation can be reordered up to transactional equivalence
Misc notes
- This AVM does not feature unification. Unification would involve objects defined relationally instead of functionally, I think. It’s a bit further from the current paper. Worth exploring in the near future though.
- This AVM also does not feature any kind of typesystem, it is untyped. A typesystem should be possible to build on top.
- Currently, this AVM does not support reflection. This can probably be added but I haven’t thought too much about it yet and we want to be careful not to break the above properties.