is used to represent feature structures.
Feature structures are represented in a
string-like notation where variables are prefixed with the sign %. Such
string notation is then transformed by a special
reader in some
internal Lisp structures.
Definite clause specifications are then
described in Lisp-list notation, where a definite clause is represented
as a list, where the first element represents the head and the remaining
elements represent the body of a clause. Each relation is represented as a
list, where the first element represents the relational atom and the remaining
elements represents the arguments of the relations directly specified as
feature structures in string notation. The following structure shows the Lisp
representation of rule (r ):
((sign "[(cat vp) (sc %Tail) (sem %Sem) (lex no) (v2 %V) (string [(d-l %p0) (e-l %p)]) (deriv [(name vp-sc) (dtrs [(first %1) (rest [(first %2) (rest end)])])])]") (sign "%Arg=[(sc end) (v2 no) (string [(d-l %p0) (e-l %p1)]) (deriv %1)]") (sign "[(cat vp) (sc [(first %Arg) (rest %Tail)]) (sem %Sem) (v2 %V) (string [(d-l %p1) (e-l %p)]) (deriv %2)]"))
Grammar and lexical rules are transformed in some internal representation, where the grammar and lexicon are stored as a hash table whose key values are lists of possible entries. For grammar rules the relation name of the head of a clause and its arity is used to construct a symbol which is used as the key in the hash table. Thus, the above rule would be inserted under the key SIGN/1. However, storing of grammar rules in a hash table in that way only makes sense if we use a set of different relational symbols. For the grammar in appendix A however we only used the relation name SIGN. In order to retrieve grammar rules efficiently, we therefore use the value of the feature CAT for determining the hash key. The effect is that all grammar rules with same category value are stored in a list under the same key. Our implementation can be parameterized with respect to this issue.
For lexical rules we actually use two different hash tables, where both point
to the same set of lexical entries (i.e., the same set of entries can be
retrieved from two different directions). For parsing
we use the first string element (which can be accessed via the path
,
and for the generation we are using the predicate name (via the path
).
Using these data structures, scanning is performed in two steps. First, for
the selected element of an active item, a possible key is determined which is
either found under the path
or
. If such a
key can be determined the corresponding entries in the lexicon are retrieved
and a list of possible candidates are returned (which can be the empty list,
if no entries exist for this current key). Each candidate is then
unified with the constraints of the selected element. If the constraints of
the selected element do not specify any information about a string or a
semantic information, no key can be determined. If this happens, the scanner
does not apply. This is also the case if no lexical entry for a key can be
determined.
In order to syntactically distinguish between definite clauses, that represent grammar rules and lexical rules, we prefix each clause either with the special grammar rule symbol ``;SPMlt;-'' or with the lexical entry symbol ``;SPMlt;;SPMlt;-''. Thus the above grammar rule would be rewritten as
(;SPMlt;- (sign ``...'') (sign `...'') (sign ``...''))
where as a lexical entry would be written as
(;SPMlt;;SPMlt;- (sign ``...''))
Both symbols are actually macros that call the respective functions.