Why Does PROLOG Resist Distribution?
Over time when ingesting something as intuition, it becomes easy to forget the reasons why certain solutions do not work. I don’t think this problem is as easy as having a modality to assert or call on other controllers, as a brute force solution would suggest. Let’s try to enumerate some of the core issues with a brute-force distributed PROLOG- I’m sure I will fail to do this adequately but let’s give it a go. This article does not propose a solution, but merely seeks to map failure modes.
Conside a predicate @ which has the semantics of ‘if I see a predicate call, do a query against another controller, otherwise add this predicate to another controller’ and that it’s safe upon backtracking (assertions are undone). This seems, on its surface, like a simple solution which treats @ like ‘RPC for PROLOG’.
1
Here’s a hypothetical: I have a controller that has a predicate p(X)- which either finds a controller it knows about for which q(X) is true, or which, failing that, finds a true p(X) on a controller.
p(X) :- @(C, q(X)).
p(X) :- @(C, p(X)).
This could cause a loop, unless we narrow the search space of controllers such that they form a directed graph. XSB tries to avoid this problem by not calling the same goal naively, but uses tabling instead, and suspends computation at a fixed point instead of looping. In this case, a controller C1 which calls p(X) would mark p/1 as ‘in progress’, delegate to C2, which seeing p/1 already being in progress, would suspend computation, until a solution appears. But maybe a full discussion of this is out of scope for now.
2
Let’s discuss another case where things can be difficult-- assertions. Consider the following: I have a controller that, as a part of doing some work, makes an assertion against a controller and then immediately calls it.
p(X) :- @(c1, q(3) :- true), @(c1, q(X)).
If I have a second controller which at the same time attempts to assert q(4) :- true- how should this be handled? Should it add q(4) :- true and q(3) :- true in whatever order they were received, or reject new implementations? If writes interleave, then I can end up getting q(4). Well this is fine, but in the case of more elaborate predicates, like recursive ones, order really does matter. I would argue, therefore, that the assert and call here should really be part of one transactional unit and that writes shouldn’t be interleaved, otherwise I can get really unpredictable behaviour. After all, if I say
@(c1, factorial(0, 1) :- true), @(c1, factorial(N, X) :- N1 is N-1, factorial(N1, X1), X is N*X1)
Then I would hope that someone can’t interleave a predicate between these two.
3
If assertions are safe under backtracking, then my undone assertions can make things very strange for other controllers. E.G.,
p(X) :- @(c1, q(3) :- true), fail.
If another controller queries after q(3) was asserted, but before the failure, it will see the state of the c1 controller in a way that is ‘about to be incorrect’ and make judgements based on that. This makes things hard to reason about.
4
Negation can become problematic within a distributed context. E.G.,
@(c1, not(q(3)).
Has q(3) failed because it was logically asserted to be false, or has it failed because it is temporarily unknown? To me, this indicates that we need three-valued truth. XSB attempts to solve this again, using well-founded semantics. Not provable yet, should not necessarily collapse into false.
5
I hope so far we’ve essentially arrived at least on the point of how essential transactional semantics are for ensuring logical isolation and atomicity, but they don’t magically solve the problem. Let’s suppose I have a transaction scope in which we do the following work
@(c1, p(4)),
@(c2, p(3)).
And p(X) does this:
p(X) :- @(c3, p(X) :- true).
Okay, so now does c3 see the call to add p(3) or p(4) first? And since these assertions are part of the same transaction, can backtracking find other solutions? What happens if p(4) fails and p(3) doesn’t? In this case, they ought to both fail. The answer is, probably, we should want this to execute as sequentially as possible within a transaction, and so this requires something like a sequencer.
6
Cuts can also make things troublesome.
p(X) :- @(c1, q(X)), !.
p(X) :- @(c2, r(X)).
There might be some in-flight calls on c1 that need to be cancelled, or where answers need to be tossed away, after the cut. This will require some distributed agreement on what cancellation looks like.
Discussion
There are many problems with the concept of a ‘distributed’ PROLOG, and indeed if it was easy then books like this https://theswissbay.ch/pdf/Gentoomen%20Library/Artificial%20Intelligence/General/Agent-Oriented%20Programming%20-%20From%20Prolog%20to%20Guarded%20Definite%20Clauses%20-%20Matthew%20M.%20Huntbach.pdf wouldn’t need to exist. Generally, we need some way to ensure controllers can have some stable representation of the state of other controllers they work with, and to me this indicates transactional reasoning over monotonic state. Certain resources would have to be locked as part of a multi-phase commit. In other words, backtracking is impossible across transactions. Moreover, transactions definitely don’t solve everything, and there are extra problems that arise within this. Despite this, I think Datalog goes wrong in presuming a fixed program- rules should also be data. If this is the case, then the classical PROLOG model of ‘single search tree, single clause order, single memory store’ falls apart IMO.
In a single statement, PROLOG resists distribution because its design requires logical time travel over a single, global search tree.
Although I don’t think four-valued truth values are a requirement here, given that we’re, under this brute-force model, only talking about controllers and what they individually know rather than ‘totalising decisions’. However, we may want three-valued truths.
I still have yet to really truly understand where XSB fits in here, but my thinking indicates that an XSB style engine is useful for reasoning over snapshots of the system, and stealing ideas from it can help make event processing over a logical language easier, but may not be so helpful for defining how snapshots are agreed upon.
It might end up being the case that anything that requires dynamic assertions are ‘too hard’ to do within a distributed PROLOG-style model and that we can only really manage querying and events in this manner. If this is true, it’s imperative that we understand this- and generally the limits of what we can do. I imagine there will be a lot of things I got wrong here or missed, so please let me know if there is very critical supplementary information that could help make thinkin about this easier (or make the problem harder, which is also something that provides value).