The following is a version of the Instruction set from my notes, typechecked with Agda v2.7.0.1.
- AVM Instruction Set - WIP Notes (you might find it more convenient to read).
Forgive the length of the content; it’s a compilation of the last few days’ work.
Thanks to @graphomath, @jan, and @Lukasz for early feedback and discussions.
Last updates:
- Change Agda@X to
Xreferences.
Motivation and Context
- Sources of inspiration:
As mentioned in this plan, this post only addresses the AVM instruction set syntax.
Disclaimer
This post is a work in progress. It is not yet complete and may contain errors.
Everything below is a proposal for defining the final instruction set of the Anoma
Virtual Machine and how to define AVM programs.
Below, you’ll see Agda code snippets and possibly references to defined terms.
If you want to read with code highlighted and clickable, I invite you to visit my notes.
What is the AVM?
AVM stands for Anoma Virtual Machine. It is a deterministic,
transaction-oriented virtual machine for message-driven objects. We
will unfold all these concepts here.
The concept of object is the deterministic sequential object explored in the
Objects module. Briefly, an
object is a function from input sequences to output sequences. In this view, an
object “reacts” to messages, input sequences, and produces output sequences,
possibly empty.
What is a message in the AVM?
Messages in the AVM are represented as S-expressions. The Val type defines
the structure of these S-expressions. It is a sum type of integers, booleans,
strings, pairs, and lists.
data Val : Set where
VInt : ℕ → Val
VBool : Bool → Val
VString : String → Val
VPair : Val → Val → Val
VNil : Val
VList : List Val → Val
This S-expression format allows objects to exchange structured data while
maintaining a simple, uniform representation. We define semantic aliases
to clarify the message-passing nature of object interaction:
Message : Set
Message = Val
Inputis the type of inputs to an object.Outputis the type of outputs from an object.
Input : Set
Input = Message
Output : Set
Output = Message
What is an object in the AVM?
The object model in the AVM is minimal. An object is a record with a history of inputs and a
partial function from input sequences to output sequences.
record Object : Set where
constructor mkObj
field
history : List Input
object-type : List Input → Maybe (List Output)
Given an Object, one can query its output, transition the object to a new state by
processing new input, or determine whether it can process an input, among other
operations.
The AVM as a state machine
As a virtual machine, the AVM is a state machine that executes
programs—sequences of instructions or events—within an ambient execution frame.
Programs interact with this external environment through the execution of
instructions. The environment provides the necessary context for program
execution: the objects being operated upon, their inputs, and the outputs
they produce.
Central to the AVM is the notion of an instruction and event. We model the AVM’s
instruction set as an algebraic datatype whose constructors precisely define
the admissible state transitions.
With an instruction set type in place, the semantics of AVMProgram can be
defined in terms of interaction
trees. From these, we derive an
operational semantics as a small-step transition system, enabling formal
verification of properties such as determinism.
State of the AVM: The Execution Environment
Programs run on objects, and objects live in a node/controller. Think of
a node/controller as a server or computer that hosts the objects, and the AVM
instance as another program running on the node/controller where we can interact
with the objects.
Node identifiers are aliased to NodeId.
NodeId : Set
NodeId = String
Objects are identified by ObjectId. The specifics of the ObjectId type
are not important for the AVM. It is just a unique identifier for an object. Say
for example that the ObjectId is a pair of a NodeId and a String.
ObjectId : Set
ObjectId = NodeId × String
We will also assume that:
- all objects live in the same node (forever),
- the ambient execution frame is the node itself, and
- no other node is considered.
Future work could investigate how these assumptions could be relaxed.
Store type
As part of the execution, the AVM provides a view of the store of all objects in the
AVM. However, this view is partial because an object may not yet exist in the
store or may not be available. This is represented by the Store function type, where
a store is a partial function from ObjectId to Object.
Store : Set
Store = ObjectId -> Maybe Object
Transactional semantics
The AVM is a transactional machine. Programs can induce discrete state changes
as atomic blocks, ensuring that state changes are either fully applied or have
no effect at all. Therefore, some “memory” is needed in the current instance of
the AVM to store these tentative state changes—precisely the inputs sent to each
object. This is represented by the TxLog function type, where a TxLog is a
partial function from ObjectId to Input.
TxLog : Set
TxLog = ObjectId -> Maybe Input
As in TxLog, we use the Tx prefix to denote transaction-related terms,
following standard literature conventions. Note that the term “transaction” in
the Anoma ecosystem has a different meaning than here: it can refer to Resource Machine
transactions (the lowest level of atomicity that AVM programs ultimately compile
to).
Each AVM transaction is uniquely identified by a TxId.
postulate
Hash : Set
TxId : Set
TxId = Hash
Ctx datatype
At any given point in the execution of the AVM, the current object is the one
that is being executed. The current input is the input that is being processed
by the current object.
record Ctx : Set where
constructor ctx
field
self : ObjectId
input : Input
traceMode : Bool -- Whether to trace the execution of the AVM. handy?
State datatype
The AVM maintains an execution state that captures all information needed for
transactional computation. The state machine processes instructions by taking
the current state and an input sequence as input, producing a new state, output,
and execution trace.
The State record encapsulates the transactional context:
record State : Set where
constructor st
field
store : Store -- Persistent store of committed objects
L : TxLog -- Tentative write-set for the current transaction
tx : Maybe TxId -- Optional identifier of the open transaction
y : Ctx -- Execution context, it includes the current object and input
Each field serves a specific role in transaction processing:
storeholds objects that have been successfully committed.Laccumulates writes within the current transaction (uncommitted changes).txtracks whether a transaction is currently active (some id) or not (nothing).yprovides additional execution context, it includes the current object and input.
Note that the input to the program is present in the execution context y—this
is the input being processed by the current object.
Then, a first approximation of the signature of the state transition function is:
AVM : State -> Instruction R -> (State × R × Trace)
Here, Instruction is a type-level function from the instruction set of the AVM
to the result type of the instruction. Trace is a list of instruction-specific
AVMEvent that occurred during the execution of the instruction.
record AVMEvent : Set where
constructor ev
field
event-name : String
event-data : List Val
event-result : Maybe Val
event-error : Maybe Val
-- you name it
Trace : Set
Trace = List AVMEvent
Building on this view, the signature of the AVM state transition function is:
AVM : State -> Program R -> (State × R × Trace)
Here, Program is a set of interaction trees over the instruction set.
Now it is time to define our first candidate for the instruction set of the AVM.
Baseline ISA
The AVM instruction set architecture (ISA) defines the primitive operations
available to programs. Instructions are modeled as an inductive datatype where
each constructor specifies its return type. ISA is the type-level function
signature of the instruction set.
ISA : Set
ISA = Set -> Set
The AVM requires additional machinery to handle introspection and debugging.
We need to track the safety of operations—that is, whether they are safe
to use in a safe context.
Safety level for operations
Two safety levels are defined: Safe and Unsafe.
data Safety : Set where
Safe : Safety
Unsafe : Safety
Filtertype: a predicate onObjectId.
Filter : Set
Filter = ObjectId -> Bool
Dynamictype: a dynamic metadata representation.
postulate
Dynamic : Set -- Dynamic metadata representation
The baseline instruction set Instr₀ provides ten primitive operations:
data Instr₀ : Safety -> ISA where
-- Transaction management
beginTx : Instr₀ Safe TxId -- Start new transaction, returns its ID
commitTx : TxId → Instr₀ Safe Bool -- Commit transaction, returns success
abortTx : TxId → Instr₀ Safe ⊤ -- Abort transaction, always succeeds
-- Object lifecycle
createObj : Input → Instr₀ Safe ObjectId -- Create object from input
destroyObj : ObjectId → Instr₀ Safe ⊤ -- Remove object from store
-- Object interaction (pure message-passing)
call : ObjectId → Input → Instr₀ Safe Output -- Send input, await output
-- Safe introspection
self : Instr₀ Safe ObjectId -- Get current object ID
input : Instr₀ Safe Input -- Receive external input
scry : Filter → Instr₀ Unsafe (List ObjectId) -- Query ALL objects
reflect : ObjectId → Instr₀ Unsafe Dynamic -- Inspect ANY object
Operations are marked as unsafe when they violate core principles of the object
model or pose systemic risks. Unsafe operations break encapsulation by accessing
object internals without going through the object’s interface, as is the case
with reflect. They bypass access control by allowing unrestricted inspection
or discovery of objects, potentially exposing sensitive information or
implementation details. They also risk unbounded computation by triggering
expensive global operations that scale with the total number of objects in the
system.
Consider scry: it applies a filter predicate across all objects in the
store. In a large system with millions of millions of objects, even a simple filter could
cause significant performance degradation. A malicious or poorly written filter
(e.g., one that always returns true) could force enumeration of the entire object store.
Similarly, reflect allows external inspection of any object’s metadata
without that object’s consent, breaking the principle that objects should
control what information they expose by processing a message—which
is not the case with reflect.
Safe programs cannot use unsafe operations. This design enforces at compile time
that safe programs remain isolated from potentially dangerous system-level operations.
These safe programs constitute what we formally define as “AVM programs”—the
standard execution model for user-level code in the AVM.
AVMProgram : Set → Set
AVMProgram = ITree (Instr₀ Safe)
And, knowing the risks of unsafe operations, we can define a SystemProgram
as any program that can use unsafe operations.
SystemProgram : Set → Set
SystemProgram A = ITree (λ B → Σ Safety (λ s → Instr₀ s B)) A
Instruction Set Semantics
The Instr₀ instruction set comprises primitive operations organized into four
categories: transaction control, object lifecycle, object interaction, and
introspection. Each instruction includes its safety level in the type signature.
With Instr₀, we have the syntax, but not the semantics,
what happens when we execute them.
We can define a small-step transition system for the instruction set in Agda
to provide formal semantics for each operation. This would establish precise
operational behavior through state transitions. We leave this formalization
for future work. Below, we describe the intended semantics of each instruction
in natural language for now.
Transaction Control
Transaction control instructions manage atomic execution contexts. All state
modifications within a transaction are tentative until committed.
beginTx
beginTx : Instr₀ Safe TxId
Initiates a new transactional context and returns a fresh transaction
identifier. All subsequent state modifications are logged to the transaction’s
write-set until the transaction is either committed or aborted. Transactions
provide atomicity: either all changes succeed or none do.
commitTx
commitTx : TxId → Instr₀ Safe Bool
Attempts to commit the transaction identified by the given TxId. It is
a synchronous operation that returns a boolean. Returns
true if the commit succeeds, persisting all logged changes to the store.
Returns false if the transaction cannot be committed (e.g., conflicts with
concurrent transactions or the transaction was already finalized).
abortTx
abortTx : TxId → Instr₀ Safe ⊤
Aborts the transaction identified by the given TxId, discarding all tentative
state changes in its write-set. The store reverts to its state before beginTx
was called. This operation always succeeds and returns unit.
Object Lifecycle
Object lifecycle instructions manage the creation and destruction of objects in
the store.
createObj
createObj : Input → Instr₀ Safe ObjectId
Creates a new object in the store from an Input. The Input
may be used to specify the object’s kind, initial program code, and
metadata. Returns a fresh ObjectId that uniquely identifies the new object.
Object creation is transactional: if the enclosing transaction aborts, the
object is not created.
destroyObj
destroyObj : ObjectId → Instr₀ Safe ⊤
Marks the object identified by the given ObjectId for destruction. The object
is removed from the store, and subsequent references to this ObjectId will
fail. Destruction is transactional: the object remains accessible within the
current transaction until committed.
Object Interaction
Object interaction is achieved through pure message-passing, preserving object
encapsulation. Message passing is the only way to interact with objects in the AVM.
call
call : ObjectId → Input → Instr₀ Safe Output
Performs synchronous message passing to the object identified by the given
ObjectId. Sends the input and blocks until the target object produces
an output. The output is returned to the caller.
Introspection and Context
Introspection instructions query the execution environment and object metadata.
self
self : Instr₀ Safe ObjectId
Returns the ObjectId of the currently executing object. This instruction is
essential for recursion and self-reference, allowing an object to pass its own
identifier to other objects or invoke itself. See also the use of self in
defining purely functional resources in the AVM.
input
input : Instr₀ Safe Input
This instruction blocks until input is available from the caller or external context.
This is the input being processed by the current object from Ctx.
scry (Unsafe)
scry : Filter → Instr₀ Unsafe (List ObjectId)
Queries the store for objects matching the given predicate Filter. Returns a
list of ObjectId for all objects satisfying the filter criteria. Marked as
unsafe because it can enumerate all objects in the system, potentially causing
performance degradation.
reflect (Unsafe)
reflect : ObjectId → Instr₀ Unsafe Dynamic
Retrieves metadata about the object identified by the given ObjectId. Returns
structured metadata. Marked as unsafe because it bypasses object encapsulation.
Bytecode Encoding
The typed Instr₀ representation provides type safety and enables formal reasoning, but practical AVM implementations require a serializable bytecode format for storage, transmission, and execution. The bytecode encoding and decoding mechanisms are defined in the Bytecode module, which provides:
- Untyped bytecode representation with opcodes
- Encoding functions from typed instructions to bytecode
- Decoding with runtime type checking
- Program serialization
See AVM Bytecode Encoding for the complete specification.
Examples
Example: Transfer
This example demonstrates an atomic transfer operation between two account objects. We assume the following context:
- An Account object type is defined.
- Account objects support
withdrawanddepositmethods that modify a balance field - Two Account objects are identified by
fromAccandtoAcc
We define helper functions for creating messages:
withdraw : ℕ → Message
withdraw n = VList (VString "withdraw" ∷ VInt n ∷ [])
deposit : ℕ → Message
deposit n = VList (VString "deposit" ∷ VInt n ∷ [])
ok? : Val → Bool
ok? (VString "ok") = true
ok? (VString _) = false -- Any other string
ok? (VInt _) = false
ok? (VBool _) = false
ok? (VPair _ _) = false
ok? VNil = false
ok? (VList _) = false
Given two object identifiers and an amount, we define a transfer function with type signature:
kudosTransfer : ObjectId → ObjectId → ℕ → AVMProgram Val
The implementation ensures atomicity through explicit transaction management:
kudosTransfer fromAcc toAcc amount =
trigger beginTx >>= λ txId →
trigger (call fromAcc (withdraw amount)) >>= λ r₁ →
(if ok? r₁ then
(trigger (call toAcc (deposit amount)) >>= λ r₂ →
if ok? r₂ then
(trigger (commitTx txId) >>= λ success →
if success then
ret (VString "transfer-complete")
else
ret (VString "commit-failed"))
else
(trigger (abortTx txId) >>= λ _ →
ret (VString "deposit-failed")))
else
(trigger (abortTx txId) >>= λ _ →
ret (VString "insufficient-funds")))
The transfer operation is atomic: either both the withdrawal and deposit succeed
and are committed, or the transaction is aborted and no state changes occur.
Example: Object Creation and Initialization
Creating a new counter object and performing initial operations:
createCounter : Val → AVMProgram ObjectId
createCounter initialValue =
trigger (createObj (VList (VString "counter" ∷ initialValue ∷ []))) >>= λ counterId →
trigger (call counterId (VString "increment")) >>= λ _ →
ret counterId
-- Using the counter
counterExample : AVMProgram Val
counterExample =
createCounter (VInt 0) >>= λ ctr →
trigger (call ctr (VString "increment")) >>= λ _ →
trigger (call ctr (VString "increment")) >>= λ _ →
trigger (call ctr (VString "get")) >>= λ v₃ →
ret v₃ -- Returns VInt 3
The counter maintains its state across calls, demonstrating the stateful nature
of objects created through createObj.
Example: Balance Update with Validation
Updating an account balance through message-passing with validation:
updateBalance : ObjectId → Val → AVMProgram Val
updateBalance account v = helper v
where
invalidMsg : Val
invalidMsg = VString "invalid-value-type"
helper : Val → AVMProgram Val
helper (VInt n) =
-- Validate non-negative
(if 0 ≤? n then
-- Get current balance via message
(trigger (call account (VString "get-balance")) >>= λ oldValue →
-- Update via message
trigger (call account (VList (VString "set-balance" ∷ VInt n ∷ []))) >>= λ result →
if ok? result then
ret (VString "balance-updated")
else
ret (VString "update-failed"))
else
ret (VString "invalid-balance"))
helper (VBool _) = ret invalidMsg
helper (VString _) = ret invalidMsg
helper (VPair _ _) = ret invalidMsg
helper VNil = ret invalidMsg
helper (VList _) = ret invalidMsg
Balance updates are validated before being applied through the object’s message interface.
Batch transfer
Calling a multi-method that operates on multiple objects simultaneously:
transferBatch : List (ObjectId × ObjectId × ℕ) → AVMProgram Val
transferBatch transfers =
processTransfers transfers
where
processTransfers : List (ObjectId × ObjectId × ℕ) → AVMProgram Val
processTransfers [] = ret (VString "batch-complete")
processTransfers ((from , to , amount) ∷ rest) =
kudosTransfer from to amount >>= λ result →
(if ok? result then
processTransfers rest
else
ret (VString "batch-failed"))
Batch operations can be made atomic by wrapping them in an appropriate transaction
boundaries.
Example: Object Destruction with Cleanup
Destroying an object after transferring its remaining balance:
closeAccount : ObjectId → ObjectId → AVMProgram Val
closeAccount accountToClose beneficiary =
-- Get final balance
trigger (call accountToClose (VString "get-balance")) >>= λ balance →
closeWithBalance balance
where
closeWithBalance : Val → AVMProgram Val
closeWithBalance (VInt n) =
if 0 <? n then
-- Transfer remaining funds
(trigger (call accountToClose (withdraw n)) >>= λ _ →
trigger (call beneficiary (deposit n)) >>= λ _ →
-- Destroy the account
trigger (destroyObj accountToClose) >>= λ _ →
ret (VString "account-closed"))
else
-- No balance, just destroy
(trigger (destroyObj accountToClose) >>= λ _ →
ret (VString "account-closed-empty"))
closeWithBalance (VString _) = ret (VString "invalid-balance")
closeWithBalance (VBool _) = ret (VString "invalid-balance")
closeWithBalance (VPair _ _) = ret (VString "invalid-balance")
closeWithBalance VNil = ret (VString "invalid-balance")
closeWithBalance (VList _) = ret (VString "invalid-balance")
This ensures no funds are lost when an account is closed - the remaining balance is
transferred before destruction using destroyObj
What’s next?
We just defined the instruction set syntax. This is the first iteration.
However, we need to define an interpreter for the instruction set.
Otherwise, how exactly does the execution of a program actually work?
The interpreter will be used to execute the instructions and produce
the output.
interpreter : State -> Instr₀ Safe -> (State × Output × Trace)
We already discussed how these interpreters should be defined, see
Interpreting ITrees.
Also, from here, it should be relatively easy to prove:
- atomicity (all-or-nothing per tx)
- determinism
