The MOLOKO grammar

Version 2.2 (December 2005)

Introduction

The MOLOKO grammar defines a natural language grammar for a range of phenomena in the domain of "talking about things to make and do." MOLOKO is specified as a Combinatory Categorial Grammar, which can be processed with the OpenCCG development platform. OpenCCG enables you to use MOLOKO both for parsing (analysis) and for realization (production). Brief documentation of the grammar is given below.

Download

The MOLOKO grammar is provided as open-source software, under the GNU Lesser General Public License.

Required: Java 1.5 (or higher), Apache Ant, OpenCCG 0.8.6 (or higher)
Authors: Geert-Jan M. Kruijff, Sabrina Wilske (mail for comments, questions, suggestions)
Download: moloko.dec05.zip (922 KB, December 12 2005)
Install: Unzip the downloaded archive. Run build.sh to start the Antishield installation shield.
Platforms: Platform independent. Tested on Mac OS X 10.4.3 and Linux (SuSE, RedHat, CentOS 4.1).
For an introduction to the XML- and XSL-level specification of grammars for OpenCCG, please see the "Rough guide" that comes with OpenCCG.

Brief documentation for the MOLOKO grammar

The grammar distribution consists of two parts: (1) a core grammar [ grammars/openccg/core/ ], and (2) the MOLOKO grammar [ grammars/openccg/moloko/ ]. The MOLOKO grammar focuses primarily on expressions used when talking about objects and their properties, the spatial organization of objects in a local visual scene, and rooms. The grammar handles basic forms of assertions, commands, and questions (factive as well as polar).

Sample interactions for the use of the MOLOKO grammar in the setting of supervised spatial exploration can be found in this PDF. Most of the families for the MOLOKO grammar are described in this PDF (draft!), though please note that some structures have changed slightly since that draft was written. For the latest documentation, please refer to the documentation in the grammar specifications.

The CORE grammar
The core grammar defines basic atomic categories, lexical families, and macros for SVO languages:

cats.xsl Atomic categories
dict.xsl Lexical macros
adj.xsl Lexical families: adjectives
adv.xsl Lexical families: adverbials
coord.xsl Lexical families: coordination
cue.xsl Lexical families: cue words
det.xsl Lexical families: determiners
expl.xsl Lexical families: expletives
n.xsl Lexical families: nouns
neg.xsl Lexical families: negation
q.xsl Lexical families: questions
v.X.Y.xsl Lexical families: verbs (modal, copula, auxiliary, infinitival; intransitive, transitive)

The file core/lexicon.xsl includes a full list of the kinds of lexical families. The MOLOKO grammar references this file in order to (re)use the families specified in the core.

The MOLOKO grammar
In addition, the MOLOKO grammar specifies lexical families for the following types:

ctxtref.xsl Lexical families: contextual reference
du.xsl Lexical families: discourse units
pp.xsl Lexical families: prepositional phrases
ppspat.xsl Lexical families: spatial prepositions
ctxtref.xsl Lexical families: punctuation (comma categories)

The file moloko/dict.xsl specifies the complete dictionary for the MOLOKO grammar. Example expressions are provided in the moloko/testbed.xml test suite. The (flat) ontology currently being used in the grammar is given in moloko/types.xsl.
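
As a quick sanity check after unzipping, the file layout described above can be verified with a short script. This is only a minimal sketch: the function name missing_files is hypothetical, it assumes the archive unpacks so that grammars/openccg/ sits directly under the chosen root directory, and it treats the v.X.Y.xsl entry as the glob pattern v.*.xsl. The file names themselves are taken from the lists above.

```python
from pathlib import Path

# File names as listed in this document. "v.*.xsl" stands in for the
# v.X.Y.xsl entry; reading it as a glob pattern is an assumption.
CORE_FILES = [
    "cats.xsl", "dict.xsl", "adj.xsl", "adv.xsl", "coord.xsl", "cue.xsl",
    "det.xsl", "expl.xsl", "n.xsl", "neg.xsl", "q.xsl", "v.*.xsl",
    "lexicon.xsl",
]
MOLOKO_FILES = [
    "ctxtref.xsl", "du.xsl", "pp.xsl", "ppspat.xsl",
    "dict.xsl", "testbed.xml", "types.xsl",
]


def missing_files(root):
    """Return the listed files (or patterns) with no match under root.

    Assumes the layout root/grammars/openccg/{core,moloko}/.
    """
    base = Path(root) / "grammars" / "openccg"
    missing = []
    for subdir, names in (("core", CORE_FILES), ("moloko", MOLOKO_FILES)):
        for name in names:
            # glob() also handles plain file names and yields nothing
            # when the directory itself does not exist.
            if not any((base / subdir).glob(name)):
                missing.append(f"{subdir}/{name}")
    return missing
```

Calling missing_files on the directory where the archive was unzipped returns an empty list when every file named above is present, and otherwise lists the entries that could not be found.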

Latest News

December 12 2005
v2.2 download available