Skip to content

Commit 76a2c5e

Browse files
committed
Copy chalk_engine README
1 parent 9fe13e7 commit 76a2c5e

File tree

1 file changed

+277
-2
lines changed

1 file changed

+277
-2
lines changed

src/traits/slg.md

Lines changed: 277 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,278 @@
1-
# The SLG solver
1+
# The On-Demand SLG solver
22

3-
TODO: <https://github.com/rust-lang-nursery/chalk/blob/master/chalk-engine/src/README.md>
3+
## Description of how it works
4+
5+
The basis of the solver is the `Forest` type. A *forest* stores a
6+
collection of *tables* as well as a *stack*. Each *table* represents
7+
the stored results of a particular query that is being performed, as
8+
well as the various *strands*, which are basically suspended
9+
computations that may be used to find more answers. Tables are
10+
interdependent: solving one query may require solving others.
11+
12+
### Walkthrough
13+
14+
Perhaps the easiest way to explain how the solver works is to walk
15+
through an example. Let's imagine that we have the following program:
16+
17+
```rust
18+
trait Debug { }
19+
20+
struct u32 { }
21+
impl Debug for u32 { }
22+
23+
struct Rc<T> { }
24+
impl<T: Debug> Debug for Rc<T> { }
25+
26+
struct Vec<T> { }
27+
impl<T: Debug> Debug for Vec<T> { }
28+
```
29+
30+
Now imagine that we want to find answers for the query `exists<T> {
31+
Rc<T>: Debug }`. The first step would be to u-canonicalize this query; this
32+
is the act of giving canonical names to all the unbound inference variables based on the
33+
order of their left-most appearance, as well as canonicalizing the universes of any
34+
universally bound names (e.g., the `T` in `forall<T> { ... }`). In this case, there are no
35+
universally bound names, but the canonical form Q of the query might look something like:
36+
37+
Rc<?0>: Debug
38+
39+
where `?0` is a variable in the root universe U0. We would then go and
40+
look for a table with this as the key: since the forest is empty, this
41+
lookup will fail, and we will create a new table T0, corresponding to
42+
the u-canonical goal Q.
43+
44+
**Ignoring negative reasoning and regions.** To start, we'll ignore
45+
the possibility of negative goals like `not { Foo }`. We'll phase them
46+
in later, as they bring several complications.
47+
48+
**Creating a table.** When we first create a table, we also initialize
49+
it with a set of *initial strands*. A "strand" is kind of like a
50+
"thread" for the solver: it contains a particular way to produce an
51+
answer. The initial set of strands for a goal like `Rc<?0>: Debug`
52+
(i.e., a "domain goal") is determined by looking for *clauses* in the
53+
environment. In Rust, these clauses derive from impls, but also from
54+
where-clauses that are in scope. In the case of our example, there
55+
would be three clauses, each coming from the program. Using a
56+
Prolog-like notation, these look like:
57+
58+
```
59+
(u32: Debug).
60+
(Rc<T>: Debug) :- (T: Debug).
61+
(Vec<T>: Debug) :- (T: Debug).
62+
```
63+
64+
To create our initial strands, then, we will try to apply each of
65+
these clauses to our goal of `Rc<?0>: Debug`. The first and third
66+
clauses are inapplicable because `u32` and `Vec<?0>` cannot be unified
67+
with `Rc<?0>`. The second clause, however, will work.
68+
69+
**What is a strand?** Let's talk a bit more about what a strand *is*. In the code, a strand
70+
is the combination of an inference table, an X-clause, and (possibly)
71+
a selected subgoal from that X-clause. But what is an X-clause
72+
(`ExClause`, in the code)? An X-clause pulls together a few things:
73+
74+
- The current state of the goal we are trying to prove;
75+
- A set of subgoals that have yet to be proven;
76+
- There are also a few things we're ignoring for now:
77+
- delayed literals, region constraints
78+
79+
The general form of an X-clause is written much like a Prolog clause,
80+
but with somewhat different semantics. Since we're ignoring delayed
81+
literals and region constraints, an X-clause just looks like this:
82+
83+
G :- L
84+
85+
where G is a goal and L is a set of subgoals that must be proven.
86+
(The L stands for *literal* -- when we address negative reasoning, a
87+
literal will be either a positive or negative subgoal.) The idea is
88+
that if we are able to prove L then the goal G can be considered true.
89+
90+
In the case of our example, we would wind up creating one strand, with
91+
an X-clause like so:
92+
93+
(Rc<?T>: Debug) :- (?T: Debug)
94+
95+
Here, the `?T` refers to one of the inference variables created in the
96+
inference table that accompanies the strand. (I'll use named variables
97+
to refer to inference variables, and numbered variables like `?0` to
98+
refer to variables in a canonicalized goal; in the code, however, they
99+
are both represented with an index.)
100+
101+
For each strand, we also optionally store a *selected subgoal*. This
102+
is the subgoal after the turnstile (`:-`) that we are currently trying
103+
to prove in this strand. Initally, when a strand is first created,
104+
there is no selected subgoal.
105+
106+
**Activating a strand.** Now that we have created the table T0 and
107+
initialized it with strands, we have to actually try and produce an
108+
answer. We do this by invoking the `ensure_answer` operation on the
109+
table: specifically, we say `ensure_answer(T0, A0)`, meaning "ensure
110+
that there is a 0th answer".
111+
112+
Remember that tables store not only strands, but also a vector of
113+
cached answers. The first thing that `ensure_answer` does is to check
114+
whether answer 0 is in this vector. If so, we can just return
115+
immediately. In this case, the vector will be empty, and hence that
116+
does not apply (this becomes important for cyclic checks later on).
117+
118+
When there is no cached answer, `ensure_answer` will try to produce
119+
one. It does this by selecting a strand from the set of active
120+
strands -- the strands are stored in a `VecDeque` and hence processed
121+
in a round-robin fashion. Right now, we have only one strand, storing
122+
the following X-clause with no selected subgoal:
123+
124+
(Rc<?T>: Debug) :- (?T: Debug)
125+
126+
When we activate the strand, we see that we have no selected subgoal,
127+
and so we first pick one of the subgoals to process. Here, there is only
128+
one (`?T: Debug`), so that becomes the selected subgoal, changing
129+
the state of the strand to:
130+
131+
(Rc<?T>: Debug) :- selected(?T: Debug, A0)
132+
133+
Here, we write `selected(L, An)` to indicate that (a) the literal `L`
134+
is the selected subgoal and (b) which answer `An` we are looking for. We
135+
start out looking for `A0`.
136+
137+
**Processing the selected subgoal.** Next, we have to try and find an
138+
answer to this selected goal. To do that, we will u-canonicalize it
139+
and try to find an associated table. In this case, the u-canonical
140+
form of the subgoal is `?0: Debug`: we don't have a table yet for
141+
that, so we can create a new one, T1. As before, we'll initialize T1
142+
with strands. In this case, there will be three strands, because all
143+
the program clauses are potentially applicable. Those three strands
144+
will be:
145+
146+
- `(u32: Debug) :-`, derived from the program clause `(u32: Debug).`.
147+
- Note: This strand has no subgoals.
148+
- `(Vec<?U>: Debug) :- (?U: Debug)`, derived from the `Vec` impl.
149+
- `(Rc<?U>: Debug) :- (?U: Debug)`, derived from the `Rc` impl.
150+
151+
We can thus summarize the state of the whole forest at this point as
152+
follows:
153+
154+
```
155+
Table T0 [Rc<?0>: Debug]
156+
Strands:
157+
(Rc<?T>: Debug) :- selected(?T: Debug, A0)
158+
159+
Table T1 [?0: Debug]
160+
Strands:
161+
(u32: Debug) :-
162+
(Vec<?U>: Debug) :- (?U: Debug)
163+
(Rc<?V>: Debug) :- (?V: Debug)
164+
```
165+
166+
**Delegation between tables.** Now that the active strand from T0 has
167+
created the table T1, it can try to extract an answer. It does this
168+
via that same `ensure_answer` operation we saw before. In this case,
169+
the strand would invoke `ensure_answer(T1, A0)`, since we will start
170+
with the first answer. This will cause T1 to activate its first
171+
strand, `u32: Debug :-`.
172+
173+
This strand is somewhat special: it has no subgoals at all. This means
174+
that the goal is proven. We can therefore add `u32: Debug` to the set
175+
of *answers* for our table, calling it answer A0 (it is the first
176+
answer). The strand is then removed from the list of strands.
177+
178+
The state of table T1 is therefore:
179+
180+
```
181+
Table T1 [?0: Debug]
182+
Answers:
183+
A0 = [?0 = u32]
184+
Strand:
185+
(Vec<?U>: Debug) :- (?U: Debug)
186+
(Rc<?V>: Debug) :- (?V: Debug)
187+
```
188+
189+
Note that I am writing out the answer A0 as a substitution that can be
190+
applied to the table goal; actually, in the code, the goals for each
191+
X-clause are also represented as substitutions, but in this exposition
192+
I've chosen to write them as full goals, following NFTD.
193+
194+
Since we now have an answer, `ensure_answer(T1, A0)` will return `Ok`
195+
to the table T0, indicating that answer A0 is available. T0 now has
196+
the job of incorporating that result into its active strand. It does
197+
this in two ways. First, it creates a new strand that is looking for
198+
the next possible answer of T1. Next, it incorpoates the answer from
199+
A0 and removes the subgoal. The resulting state of table T0 is:
200+
201+
```
202+
Table T0 [Rc<?0>: Debug]
203+
Strands:
204+
(Rc<?T>: Debug) :- selected(?T: Debug, A1)
205+
(Rc<u32>: Debug) :-
206+
```
207+
208+
We then immediately activate the strand that incorporated the answer
209+
(the `Rc<u32>: Debug` one). In this case, that strand has no further
210+
subgoals, so it becomes an answer to the table T0. This answer can
211+
then be returned up to our caller, and the whole forest goes quiescent
212+
at this point (remember, we only do enough work to generate *one*
213+
answer). The ending state of the forest at this point will be:
214+
215+
```
216+
Table T0 [Rc<?0>: Debug]
217+
Answer:
218+
A0 = [?0 = u32]
219+
Strands:
220+
(Rc<?T>: Debug) :- selected(?T: Debug, A1)
221+
222+
Table T1 [?0: Debug]
223+
Answers:
224+
A0 = [?0 = u32]
225+
Strand:
226+
(Vec<?U>: Debug) :- (?U: Debug)
227+
(Rc<?V>: Debug) :- (?V: Debug)
228+
```
229+
230+
Here you can see how the forest captures both the answers we have
231+
created thus far *and* the strands that will let us try to produce
232+
more answers later on.
233+
234+
## Heritage and acroynms
235+
236+
This solver implements the SLG solving technique, though extended to
237+
accommodate hereditary harrop (HH) predicates, as well as the needs of
238+
lazy normalization.
239+
240+
Its design is kind of a fusion of [MiniKanren] and the following
241+
papers, which I will refer to as EWFS and NTFD respectively:
242+
243+
> Efficient Top-Down Computation of Queries Under the Well-formed Semantics
244+
> (Chen, Swift, and Warren; Journal of Logic Programming '95)
245+
246+
> A New Formulation of Tabled resolution With Delay
247+
> (Swift; EPIA '99)
248+
249+
[MiniKanren]: http://minikanren.org/
250+
251+
In addition, I incorporated extensions from the following papers,
252+
which I will refer to as SA and RR respectively, that describes how to
253+
do introduce approximation when processing subgoals and so forth:
254+
255+
> Terminating Evaluation of Logic Programs with Finite Three-Valued Models
256+
> Riguzzi and Swift; ACM Transactions on Computational Logic 2013
257+
> (Introduces "subgoal abstraction", hence the name SA)
258+
>
259+
> Radial Restraint
260+
> Grosof and Swift; 2013
261+
262+
Another useful paper that gives a kind of high-level overview of
263+
concepts at play is the following:
264+
265+
> XSB: Extending Prolog with Tabled Logic Programming
266+
> (Swift and Warren; Theory and Practice of Logic Programming '10)
267+
268+
There are a places where I intentionally diverged from the semantics
269+
as described in the papers -- e.g. by more aggressively approximating
270+
-- which I marked them with a comment DIVERGENCE. Those places may
271+
want to be evaluated in the future.
272+
273+
A few other acronyms that I use:
274+
275+
- WAM: Warren abstract machine, an efficient way to evaluate Prolog programs.
276+
See <http://wambook.sourceforge.net/>.
277+
- HH: Hereditary harrop predicates. What Chalk deals in.
278+
Popularized by Lambda Prolog.

0 commit comments

Comments
 (0)