Home

Awesome

DCGs for Parse Trees

Extend Definite Clause Grammars (DCG) for Prolog by an additional argument to automatically store the parse tree.

Synopsis

:- use_module(library(dcg4pt/expand)).

% Extended DCGs get expanded to hold an additional
%   argument with parse tree.
sentence --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> determiner, noun(N).
verb_phrase(N) --> ( verb(N) ; verb(N), noun_phrase(_) ).
noun(sg) --> [boy]  ; [apple].
noun(pl) --> [boys] ; [apples].
determiner --> [the].
verb(sg) --> [eats].
verb(pl) --> [eat].

main :-
  phrase(sentence(Tree), [the, boy, eats, the, apples]),
  print_term(Tree, [indent_arguments(2)]).

% prints:
%   sentence([
%     noun_phrase([
%       determiner(the),
%       noun(boy) ]),
%     verb_phrase([
%       verb(eats),
%       noun_phrase([
%         determiner(the),
%         noun(apples) ]) ]) ])

Installation

This pack is available from the add-on registry of SWI-Prolog.

It can be installed with pack_install/1:

?- pack_install(dcg4pt).

Requirements

Only for development purposes the tap pack is needed:

?- pack_install(tap).

DCG Expansion

In most cases you simply want to automatically expand all given DCGs with the additional parse tree argument. To do so, simply call:

:- use_module(library(dcg4pt/expand)).
% and later the definition of DCGs

Additionally you can manually call the predicates to translate a DCG4PT rule:

?- use_module(library(dcg4pt)).
?- dcg4pt_rule_to_dcg_rule((sentence --> noun_phrase, verb_phrase), DCG).
DCG = (sentence(X) --> ( ... )).

Usage

You can use the generated predicates like normal DCGs, besides that they provide an additional argument to hold the parse tree. It is automatically added as the very last argument of the DCG body.

Bound Arguments

library(dcg4pt) has been implemented to provide a tool that generates a parse tree from a given input list but also the other way around, i.e., to generate a list based on a parse tree. So you can also use it this way:

?- Tree = noun_phrase([determiner(the), noun(boy)]),
   phrase(noun_phrase(N, Tree), List).
Tree = noun_phrase([determiner(the), noun(boy)]),
N = sg,
List = [the, boy] .

Sequences

library(dcg4pt) provides a built-in sequence(+Quantifier, :Body) DCG body to resolve sequences of Body with the quantifiers '?', '+', and '*' as known from regular expressions. Their occurrences are represented as (possibly empty) lists in the parse tree:

:- use_module(library(dcg4pt/expand)).

single --> [a].
list --> sequence('*', single).
non_empty_list --> sequence('+', single).
optional --> sequence('?', single).

main :-
  phrase(list(Tree), [a, a, a]),
  print_term(Tree, [indent_arguments(2)]).

% prints:
%   list([
%     single(a),
%     single(a),
%     single(a) ])

With a free variable given as the parse tree, the possibilities are generated beginning with the smallest solution:

?- phrase(non_empty_list(Tree), List).
Tree = non_empty_list([single(a)]),
List = [a] ;
Tree = non_empty_list([single(a), single(a)]),
List = [a, a] ;
...