This is a diagram intended to display the basic design components for the first version of the AL kernel as I intend to write it in Elixir: processes, algorithms, and other basic details. Many things have not been considered here and it will need to be heavily iterated upon, but I hope it will be instructive and start a discussion. I expect the event sourcing side in particular to undergo heavy changes.
One topic that has seen no iteration here so far, for example, is efficient storage and retrieval. ETS/Mnesia don't really do much indexing beyond constant indexing, so there are probably a lot of optimisations we can (and eventually will need to) make here.
I also hope that it's at least semi-readable! A draft implementation will now begin based on this design. As things often go, I assume many things will become clearer as the work progresses, and I hope to note these as I go.
This seems fine (I'll give more substantive feedback in a future response). The only thing to note for the future is that our subset of RV32IM from Elixir doesn't handle GenServer sends. But we can figure out the implementation for that use case after we get a better idea of what we're doing.
This might actually be fine depending on what we end up sending into an RV32IM environment, but I'm unsure currently.
That's something I was thinking about during design as well, yeah. I think this approach is more incremental, given we can integrate it into the current Local Domain. Since we have the data for objects, we can still use the approach where we send objects to a resource machine environment, like the current implementation in Local Domain is doing. Whereas if we care immediately about compiling to RV32IM, we might as well say we're just using Elixir as the compiler language, the Local Domain code itself is not relevant, and there's the added burden of having to implement the "data store" side immediately. That's my understanding of the trade-offs, at least.
Thanks for the write-up! I very much support going ahead with prototyping, even if we don't have the full compilation stack working yet.
I want to check my understanding of the basic model here, for two reasons:
I'd like to have a clear picture (both for myself and for others) of what the essential, minimal set of structural design choices is here (vs. implementation choices made for practicality, efficiency, etc.).
I would also like to quickly write my own prototype of your design (time permitting), so that we can compare.
My understanding of the basic model (based on our discussions with @l4e21 and previous threads) is that of a monotonic, relational object system based on an event log. I understand this to imply, at minimum, the following structure:
The single, canonical, and complete source of truth of system history is an append-only event log, where the only possible write operation is to append an event, and the only possible read operations are to read a single event and to scan (for events matching a predicate).
At least in the initial model (before we deal with the distributed case), the event log is totally ordered (i.e. a list), and where we are in "system time" is just an index into that list.
We have some notion of an "AL program" which can be "run". "Relational object system" implies a particular program structure here (objects, methods, backtracking, etc.), and there are design decisions to be made there that I haven't thought through yet, but regardless there must be such a thing as a "program" and a definition of what it means to "run" it. Abstractly, the AL abstract machine can be understood as an object which accepts one kind of message (a program), runs it, and returns (in a message response) the results.
Since the event log is the single, canonical, and complete source of truth, the result of running a particular AL program can be fully determined by (a) the event log against which it is run and (b) the program itself.
This means that whatever "write" operations we support in programs must ultimately decompose into event-log appends, and whatever "read" operations we support in programs must ultimately decompose into event-log reads/scans.
An "object store", as you describe it, is effectively a cache, with perhaps two purposes:
a. Allow programs to control when their writes are "committed" (to the event log)
b. Keep a convenient in-memory representation of objects for efficiency purposes (as opposed to scanning the event log every time).
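To make the first few assumptions concrete, here is how I would sketch the minimal event log in Elixir. This is a hedged illustration only: the module and function names are my own, and the plain list stands in for whatever storage is actually used; "system time" is simply the index into the list.

```elixir
defmodule EventLog do
  @moduledoc """
  Minimal sketch of an append-only, totally ordered event log.
  The only write is `append/2`; the only reads are `read/2` and `scan/2`.
  """

  def new, do: []

  # Append an event; returns the new log and the index ("system time")
  # at which the event landed.
  def append(log, event), do: {log ++ [event], length(log)}

  # Read a single event by its index.
  def read(log, index), do: Enum.at(log, index)

  # Scan: return the indices of all events matching a predicate.
  def scan(log, pred) do
    log
    |> Enum.with_index()
    |> Enum.filter(fn {event, _i} -> pred.(event) end)
    |> Enum.map(fn {_event, i} -> i end)
  end
end
```

Appending with `++` is of course linear-time; this is purely to pin down the interface, not a serious storage proposal.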
Before I continue further, I want to check all of these assumptions: are they accurate? Am I missing anything fundamental here?
Yes. It's worth noting that I wasn't really intending the developer to work on the raw event log directly, but rather to make use of a bootstrapped event log object. Especially since, given that the object system is a hydration of the event log, working with the object system ought to be tantamount to working on (appending to) the event log.
I think this is a good first assumption yes.
Yes. I would hope we can, as much as possible, maintain the view that the event log contains the information about all the objects in the system and that there's no "hidden information"; rather, someone should be able to replay the event log and say that they have the exact same system as someone else who replayed it. It would be nice if we can maintain this property as much as possible.
Thanks, I think that's enough for me to work on a minimal prototype based on my own understanding; then we can compare (and my efforts do not need to block anything).
I'd also like to note a correspondence that I think will be important: in a system with partial information (where different participants can only see parts of the history), but where some commitment to the history is fully public information, it should be possible to craft the event log operations in such a way that everyone can share a single event log, but the specific operations which they can perform (e.g. on which objects) are limited based on the information which is available to them. The "state structure" part of @vveiln's post here provides a concordant picture of this overall structure.
Concretely, for example, we could build an event log on top of the current RM:
Events are represented as resources.
Appending to the event log ~ creating a resource.
Reading from the event log ~ reading a previously created resource.
Rules for when it is valid to append a particular event to the event log are enforced by resource logics (where the RM structure then allows these rules to be of the form "it is valid to append event X iff events Y and Z exist in history and have not yet been marked as used").
The RM structure then enforces that everyone can share an event log and that any appends to that log will follow the rules, even though different participants may have only partial information (which may limit which appends they can perform).
Representing operations on a particular object as a series of events (an ordered subset of the event log) is a natural fit here, since it fits this form of rule (the previous version of an object can only be used to create the next version once); this is (roughly) what we did with Goose.
My question for you here is then: what form of rules (for when certain appends to the event log are valid) have you been envisioning? Would this be (implicitly) defined by the rules of the abstract machine? I do think that we should have some notion of whether an event log is "valid" (which these rules would determine).
I will naively hypothesise about how this may work for AL, hoping that anything I say that's wrong will be called out, since there are many cryptographic constraints that I do not fully understand. So far in our design, we've been working from the local side and extending outwards. In this sense, the resource machine would exist as a protocol on top of AL. Let's suppose that each object can implement a simple protocol is_resource, which determines whether that object is a resource (some objects are purely local to the user and are thus not resources); in this sense, the "resource event log" for the user is a filter on the event log for objects to which this protocol applies. Now, not all the objects in this resource event log are readable (some are encrypted), but I assume the user knows that they are at least resources.
In this case, objects may also implement methods like can_consume and can_create: methods which act as rules on what kinds of transactions can use or create these resources. Supposing that all the resources can be read within a transaction, we assume that, given a transaction, we don't need extra information from "outside" that transaction in order to figure out whether it's valid (I hope this is a valid assumption; if not, then my understanding of the resource machine is very wrong).
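In Elixir terms, those hooks might look like the following sketch. The names is_resource, can_consume, and can_create are the hypothetical ones from above; everything else (argument shapes, the fallback to "not a resource") is my own guess, not a committed design.

```elixir
# Hypothetical protocol: objects that are resources implement these;
# anything that doesn't falls back to "not a resource".
defprotocol AL.Resource do
  @fallback_to_any true
  # Is this object a resource (vs. a purely local object)?
  def is_resource(object)
  # May this transaction consume (use up) this resource?
  def can_consume(object, transaction)
  # May this transaction create this resource?
  def can_create(object, transaction)
end

defimpl AL.Resource, for: Any do
  def is_resource(_object), do: false
  def can_consume(_object, _tx), do: false
  def can_create(_object, _tx), do: false
end

defmodule AL.ResourceLog do
  # The "resource event log" as a filter over the full event log.
  def from_event_log(events), do: Enum.filter(events, &AL.Resource.is_resource/1)
end
```

The filter function directly mirrors the idea above that the resource event log is just the subset of the event log whose objects implement the protocol.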
The execution of the resource machine must therefore happen in one of two ways:
By consensus (there are many outstanding questions about how this will be implemented within AL; we still need to design these aspects with Jamie's help)
By delegating to a solver, providing them the ability to decrypt the resources in question (and therefore the solver would be a local domain)
One thing that this consideration of the resource log does not cover is time. In AL, "system time" is entirely local, but in the distributed case, of course, time is much more interesting. We cannot utilise system time for resources in any way to determine order, but would need some other first-order time attribute that is specific to the resource event log.
I know there are many things missing from this equation, but please let me know all of your thoughts on the topic. I'd be happy to organise a further design session on this. For example, as for checking whether the resource event log is valid, I've not had any ideas on this.
This is certainly possible, and in some sense necessary, if we want the entire system to run on an AL kernel/core, as that must ultimately run locally (and the resource machine concerns distributed state).
My point is more that what I like about this design is that it abstracts persistent storage as an "event log", which itself can be modeled as an object, with something like the following protocol:
append(event) → 0|1 (success/failure)
scan(predicate) → {event_ids} (set of event identifiers which match the predicate)
read(id) → maybe event (the event with the given identifier, if found)
Implicitly, a given event log has some rule about whether a particular event is valid, which can depend only on the previously appended events.
In practice, that protocol could be implemented in many different ways, e.g.
With a literal local in-memory object that keeps a list of events
With a local backing k/v store: events are stored at events/{id}, total length is stored at event_count.
Append reads the current count, writes to events/{current_count}, increases the current count by one. Rules as to which appends are valid must also be enforced here (this may require scanning past events).
Scan reads through the entire k/v store and returns matching event ids [obviously we'd need to make this more specialized/efficient in practice]
Read reads events/{id}
With a distributed state such as that provided by the resource machine
Append creates new resources (we'd probably have a single-event-to-many-resources relation here, since appending an event is the atomic operation on our side, corresponding to an RM transaction, which is the atomic operation on the RM side). Rules as to which appends are valid are enforced by the resource logics.
Scan reads through resource commitments and returns commitments where the resource data satisfies the predicate. What data can be returned will be limited to what the user can decrypt
Read reads a particular resource. Similarly, whether this succeeds will be limited to what the user can decrypt
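To make the k/v-backed variant above concrete, here is a sketch in Elixir using a plain Map as a stand-in for the real store, following the events/{id} and event_count key layout described above. The validity rule is a placeholder (accept everything), and all module/function names are illustrative assumptions.

```elixir
defmodule KVEventLog do
  @moduledoc """
  Sketch of the event log protocol over a k/v store (here, a Map).
  Layout: events live at "events/{id}", total length at "event_count".
  """

  def new, do: %{"event_count" => 0}

  # Append: read the current count, write events/{count}, bump the count.
  def append(kv, event) do
    count = Map.fetch!(kv, "event_count")

    if valid?(kv, event) do
      kv =
        kv
        |> Map.put("events/#{count}", event)
        |> Map.put("event_count", count + 1)

      {:ok, kv, count}
    else
      {:error, :invalid_event}
    end
  end

  # Read: fetch events/{id}, or nil if absent.
  def read(kv, id), do: Map.get(kv, "events/#{id}")

  # Scan: walk every stored event and return matching ids
  # (naive full scan, as noted above).
  def scan(kv, pred) do
    for id <- 0..(Map.fetch!(kv, "event_count") - 1)//1,
        pred.(read(kv, id)),
        do: id
  end

  # Placeholder for the per-log validity rule, which in general may
  # need to scan past events.
  defp valid?(_kv, _event), do: true
end
```

Swapping the Map for ETS, a disk-backed store, or the RM-backed variant should leave everything above this protocol unchanged, which is the point being made here.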
If we do this properly, everything built on top should not need to care how the "backing storage" actually works, or even whether it is distributed, as long as the protocol is followed. Alternatively phrased, we should be able to run an AL kernel (backed by distributed storage) on top of the RM on top of an AL kernel (backed by local storage). Does that make any sense / do you see what I'm gesturing towards here?
Thanks for clarifying, I have a much better idea of what you're talking about now, and it's very interesting to me. Certainly, it makes sense that, given events are resources, one ought to be able to understand AL as a hydration of events-as-resources.
The definitions of scan and read sound fine. As for append: is a single append necessarily an atomic operation in this view? In the AL design above, the top-level "message send/query" (a.k.a. a valid solution) is a transaction on the event log and can perform multiple writes during the process (which go into a transaction overlay and are then sent to the event log when the transaction is committed). Maybe we're using the word "transaction" differently here, or maybe there's some special case you're considering? The concept of "transaction" in AL and the concept of "transaction" in the RM are different.
To me, in this fundamental concept of an "event log", an "event" is simply whatever datum we can atomically append to history. The event log itself doesn't know what that datum means. The event semantics, i.e. how the event structure is interpreted, is a matter for the specific AL object model semantics. For example:
In a very simple AL without multi-object transactions, only one message to a single object can be sent atomically, so an event would just contain something like {object_id, message, response}.
In a more complex AL with multi-object transactions, multiple messages between multiple objects can be atomic (either they all happen or they donât), so an event would contain a list of {object_id, message, response} structs (or something like that).
So I'd say that, in my definitions, what makes an event an event is exactly that it is whatever can be atomically appended to history, and different AL flavors could have different definitions/semantics for what operations that can include and how event data is interpreted. What doesn't make sense to me, though, is "atomically appending multiple events to history in the same transaction": that seems inelegant, since for any system in which we did that, we could instead have just crafted a more complex event type which captured all of the changes that a single transaction made. Am I missing something essential here?
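For concreteness, the two flavors above might look like this in Elixir. These are purely illustrative shapes: the field names are the ones mentioned in this thread, and nothing here is a committed design.

```elixir
# Simple AL: one atomic message send per event.
simple_event = %{object_id: :counter, message: {:increment, 1}, response: :ok}

# Multi-object-transaction AL: the event is a list of sends that
# either all happen or none do; the atomicity boundary is the list.
complex_event = [
  %{object_id: :counter, message: {:increment, 1}, response: :ok},
  %{object_id: :audit_log, message: {:record, :incremented}, response: :ok}
]
```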
Your response makes me think of another point, which is that in order to craft the appropriate event (to append), the user may want to do reads/scans and craft the event based on them (a good simple example: incrementing the value of a counter object). Is that part of what you're thinking about? That makes sense to me, but I don't see any relationship to the question of whether (the writes performed by) a transaction correspond to one event or many. Let me know if I'm missing something.
Yeah this makes sense, thanks again for clarifying.
[quote="cwgoes, post:10, topic:2581"]
, since for any system in which we did that we could instead have just crafted a more complex event type which captured all of the changes that a single transaction made.
[/quote]
The idea behind the AL event sourcing kernel is that the event log describes events over the object system, which can be one of the following:
Some info about an object was set (a class, a superclass, some slots, an object was made runnable, a method was assigned to a class, etc)
Some info about an object was removed
Some info about an object was updated
A simple static event list that we can easily query as pure data. There shouldn't be a need to craft complex event types, since, well, things would become more complex. I think conceiving of "more complex events" as transaction boundaries makes a lot more sense, because that is metadata. I can't fully argue my point, but intuitively I think it would be best to stick to this. If we started doing mega-events, indexing would be difficult, we'd need schemas, and we'd also need to reason about whether we could split the events up into smaller atomic operations. There's semantic coupling that I think doesn't need to exist, and there'd be migration concerns again.
Would they? In your example, and in general, it seems to me like we're just talking about the difference between:
type SimpleEvent = ObjectSetInfo | ObjectRemoveInfo | ObjectUpdateInfo
versus
type ComplexEvent = [SimpleEvent]
That's not much of a difference in complexity; in fact, the only change is that the atomicity boundary is different: with the latter type, we can set info for one object and remove info for another in a single event, but with the former we cannot. This doesn't seem to me like it would change indexing complexity very much.
Now, it would be possible to implement my "complex event" event log with your "simple event" event log and transactions (where I can atomically, in the ACID sense, append multiple simple events corresponding to my complex event), but in order to do that I need a whole transactional memory system, which seems... annoying. Implementing the "simple event" event log with the "complex event" event log, by contrast, requires nothing more than a 1-element list.
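The asymmetry in that last sentence can be shown directly with a toy sketch (both logs are plain lists here, and a complex event is just a list of simple events; the names are mine):

```elixir
# A complex-event log is a list of complex events, where each complex
# event is a list of simple events (the atomic unit is the whole list).
append_complex = fn log, simple_events -> log ++ [simple_events] end

# Embedding the simple-event log: a simple append is just a complex
# append of a 1-element list. The reverse direction would need real
# transactional machinery to make several simple appends atomic.
append_simple = fn log, event -> append_complex.(log, [event]) end
```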
Ah, I see, I have been using "atomic" to mean "atomic in the ACID sense", which is at least a somewhat different and not always overlapping definition. Some concept of an "atomic" [smallest possible indivisible thing] "operation" (something done to an object) does make sense to me.
To me, the question of which writes are atomic (in the ACID sense) is something that should be up to the programmer (expressed in the program), so I'd answer your questions with:
Should these writes be merged into one event? It depends on what the program (or perhaps code of existing objects) says should be done.
At what level? At whatever level interprets the "transactionality" the program specifies.
What does the nesting buy us? At some level of the system, in order to have atomic-in-the-ACID-sense changes, we must have an atomic-in-the-indivisible-sense thing that either "happens" (is counted as part of history) or not. We can call that thing a "transaction" instead of an "event", but as far as I can see that's really just a terminological difference (then we'd have an append-only transaction log...).
My mistake, this explanation makes it seem much more like a semantic difference. I was confused by this statement specifically
As it seemed to imply hacking on or extending the primitives of the event log, which is definitely not something we should be thinking about right now (this would introduce a lot of complexity; we have some ideas of how this could be done, but it's definitely out of scope for now). As long as there is a fixed set of primitives (as systems like XT do), then I think our difference here is just semantic (we can work out which is the right phrase for it in idle conversation). As for determining the boundaries of atomicity-in-the-ACID-sense, I had pretty much figured that a transaction would just be a list of events, so it sounds like we basically agree.