Definite Clause Grammar Expansion

Jan Burse, created Oct 19. 2011 Don't we all want Candy? In our previous blog entry about goal expansion we have already shown how we can inline code via goal expansion. In this blog entry we will show how we can use goal expansion to sweeten our life in that we will extend the language of ordinary Prolog. We will show how definite clause grammars (DCGs) can be implemented by means of goal expansion. The newer versions of Jekejeke Prolog compile DCG grammars into ordinary Prolog. We will give a full account of a vanilla version of this compilation. DCGs are an extension of context free grammars. DCGs allow arbitrary tree structures be built in the course of parsing and they allow extra conditions dependent on auxiliary computations. During consult DCG rules are detected by the (-->)/2 main functor and then expanded. For our vanilla version of DCGs we will instead use the (==>)/2 main functor so that the experiment can be performed in any Prolog system.  Our vanilla DCG will only provide terminals, non-terminals and conjunction. Push backs, actions and control constructs are out of scope for the vanilla DCG. Our running example will be grammar for the acceptance of palindromes: palin ==> []. palin ==> [_]. palin ==> [Border], palin, [Border]. The only non-terminal in the above grammar is the grammar predicate “palin”. We then find an empty list of terminals ([]) in the first rule and singleton terminals both in the second ([_]) and in the third rule ([Border]). The third rule makes use of grammar conjunction to combine a singleton terminal, a non-terminal and a further singleton terminal. The example can be extended by attributes and can be used for both syntax analysis and synthesis. Corresponding example code can be found in the Jekejeke Prolog language reference. The DCG grammar rules are syntactic sugar for ordinary Prolog rules. The compilation will replace the DCG grammar rules by ordinary Prolog rules. In particular we will replace non-terminals by their extended counter parts that have two additional arguments. And we will replace a conjunction by the replacement of the operands whereby we will chain the additional arguments. Finally we will replace terminals by the predicate connects/3. This is in correspondence to the original proposal [1]. The predicate connects/3 has the following definition: connects([X | Y], X, Y). There are now different proposals around to perform the compilation. The draft proposal [2] suggests realizing a dedicated replacement rules for DCG. A different approach sees the replacement rules for DCG as special goal and term expansion rules. We will follow the later approach. But let’s first look at the implementation of the goal and term expansion rules. We can keep them very simple since we only need to support vanilla Prolog. We will call the corresponding predicates desugar_term/2 and desugar_goal/2. Their definition reads as follows: desugar_term((A :- B), (A :- C)) :- !, desugar_goal(B, C). desugar_term(A, C) :- term_desugaring(A, B), !, desugar_term(B, C). desugar_term(A, A). desugar_goal((A, B), (C, D)) :- !, desugar_goal(A, C), desugar_goal(B, D). desugar_goal(A, C) :- goal_desugaring(A, B), !, desugar_goal(B, C). desugar_goal(A, A). The goal expansion mechanism for vanilla Prolog has two variation points. We can extend it by defining rules for term_desugaring/2 and goal_desugaring/2. The vanilla DCG rules that we will plug-in will make use of a predicate extend/4. The predicate will add two additional arguments to the given goal. More details about its definition can be found in the attachment to this blog entry. To implement the replacement by goal expansion we will shift along a term of the form phrase(G, I, O) where G is the grammar goal and I, O are the additional arguments. term_desugaring((P ==> B), (Q :- phrase(B, I, O))) :- extend(P, I, O, Q). goal_desugaring(phrase((A, B), I, O), (phrase(A, I, H), phrase(B, H, O))). goal_desugaring(phrase([A], I, O), connects(I, A, O)). goal_desugaring(phrase([], I, I), true). goal_desugaring(phrase(P, I, O), Q) :- extend(P, I, O, Q). Both the rules for the grammar(==>)/2 and for the grammar (,)/2 delegate a further expansion to desugar_term/2 and desugar_goal/2. This is possible since these predicates implement a tail recursion loop after calling term_desugaring/2 and goal_desugaring/2 as can be seen by a simple code inspection.  The goal expansion is normally called from within the consult routine. For simplicity we will use a predicate convert/0 that explicitly maps the ==>/2 facts. ?- convert. Yes ?- listing(palin/2). :- dynamic palin / 2. palin(_A, _A). palin(_A, _C) :- connects(_A, _B, _C). palin(_A, _E) :- connects(_A, _B, _C), palin(_C, _D), connects(_D, _B, _E). We can correctly run a positive and a negative query against the compiled code: ?- palin("anna",[]). Yes ?- palin("bert",[]). No From the goal expansion for vanilla Prolog and from the rules for vanilla DCG a full implementation can be derived. The full implementation will not only need additional effort to cover everything from Prolog and DCG. One will also need to implement error detection. For example the term expansion of a variable should lead to an instantiation error. If we look more closely at our compilation result we see that we have lost all the variable names. A full implementation will need to carry over variable names, term positions, etc.. to support debugging of DCG code. Jekejeke Prolog currently only realizes a transfer of the variable names. Finally a full implementation might use the predicate ‘C’/3 known from the DEC10 Prolog instead of the predicate connects/3. The predicate ‘C’/3 is not very wide spread among Prolog system implementations. Many Prolog system implementations directly use ‘=’/2 or even try head unification. Unfortunately by departing from ‘C’/3 implementations loose extensibility as already explained in [1]. To allow easy extensibility we have therefore made ‘C’/3 multi-file in Jekejeke Prolog. This way the end-user can define additional rules for ‘C’/3 on his own. References [1] Pereira, F.C.N. and Warren, D.H.D. (1980): Definite Clause Grammars for Language Analysis – A Survey of the Formalism and a Comparison with Augmented Transition Networks, North-Holland Publishing Company, Artificial Intelligence, 13, 231 – 278 [2] Moura, P. et al. (2010): ISO/IEC DTR 13211 3:2006, Definite Clause Grammar Rules, Draft, April 1, 2010