SProUTomat

SProUTomat is a tool that automatically builds the SProUT system every day from its Java source files, together with all compilable lingware, from the sources in the SProUT CVS (now SVN) repository. It then tests the named entity grammars on a test corpus and generates a report that also shows graphically how precision and recall change over time.

Although SProUTomat is specific to the SProUT system, it contains parts that could be re-used for automatic testing of HPSG grammars (e.g. SProUTomat already contains methods for compiling TDL grammar files with flop for PET).

Precision and recall are measured with the JTaCo system (cf. Bering et al. 2003, available from the SProUT webpage), which compares the XML markup of an annotated corpus with SProUT results. A method is currently being developed that could compare two different XML markups without the SProUT result comparison part, i.e. independently of SProUT grammars.
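To make the two measures concrete, here is a minimal sketch of how precision and recall can be computed once gold annotations and system output have been extracted from their XML markup. It is not JTaCo's actual algorithm: the `Entity` type, the exact-match criterion on (start, end, label), and the sample data are all assumptions made for illustration.

```python
from typing import NamedTuple, Set, Tuple


class Entity(NamedTuple):
    start: int   # character offset where the entity begins
    end: int     # character offset just past the entity
    label: str   # entity type, e.g. "person" or "location"


def precision_recall(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float]:
    """Exact-match precision and recall of predicted entities against gold ones."""
    # An entity counts as correct only if span and label both match (assumption).
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data: three annotated entities, two found by the grammar,
# one of them with the wrong label.
gold = {Entity(0, 14, "person"), Entity(24, 35, "location"), Entity(40, 44, "date")}
predicted = {Entity(0, 14, "person"), Entity(24, 35, "organization")}

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here one of the two predictions is correct (precision 0.50) and one of the three gold entities is found (recall 0.33). A real comparison tool may also credit partial overlaps or ignore label mismatches; this sketch does neither.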

SProUTomat also generates the complete runtime subsystem of SProUT for the Heart of Gold middleware with the current grammars for named entity recognition (German, English, Greek, Japanese) and the ChunkieRMRS cascade grammars for German and English (four NE grammars and eight cascade grammars, runtime JAR and SProUTput applet).

SProUTomat is mainly based on an Ant script that contains all the necessary targets for building and testing.
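An Ant buildfile consists of named targets that may depend on other targets; Ant runs a target's dependencies first and runs each target at most once. The following sketch imitates that ordering. The target names and their dependencies are hypothetical and are not taken from the actual SProUTomat buildfile, and the sketch assumes there are no circular dependencies.

```python
# Hypothetical target names and dependencies; the real buildfile may differ.
TARGETS = {
    "update": [],
    "compile-java": ["update"],
    "compile-lingware": ["compile-java"],
    "build-runtime": ["compile-java", "compile-lingware"],
    "test-grammars": ["compile-lingware"],
    "report": ["test-grammars"],
}


def execution_order(target, targets, done=None):
    """Return the targets in the order they would run: dependencies first, each once."""
    if done is None:
        done = []
    for dependency in targets[target]:
        execution_order(dependency, targets, done)
    if target not in done:
        done.append(target)
    return done


order = execution_order("report", TARGETS)
print(order)
```

Requesting `report` pulls in only the targets it transitively depends on, so `build-runtime` is not run in this example.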

Here is a rough description of what SProUTomat does:

  1. update source files (Java, grammars, other resources) from the version control system
  2. Java-compile system (SProUT GUI, runtime, applet)
  3. compile resources
  4. build runtime component for Heart of Gold including lingware
  5. test grammars with JTaCo and generate report and comparison graphics
  6. build Javadoc documentation
  7. build Ant buildfile documentation and dependency graph
  8. send notification email with OK or ERROR indicating the overall success and link to generated SProUTomat report

Publication (bibtex)

Contact: Ulrich Schäfer.
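The build and test steps listed above, ending in an OK or ERROR notification, can be sketched as a simple sequential runner. This is an illustration only: the step functions, the stop-at-first-failure behaviour, the subject line format, and the report URL are assumptions, not the actual SProUTomat implementation.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: List[Step]) -> Tuple[str, List[str]]:
    """Run the steps in order and stop at the first failure. Return (status, log)."""
    log = []
    for name, action in steps:
        try:
            action()
        except Exception as error:
            log.append(f"FAILED {name}: {error}")
            return "ERROR", log
        log.append(f"done {name}")
    return "OK", log


def notification_subject(status: str, report_url: str) -> str:
    """Build a subject line carrying the overall status and the report link."""
    return f"[SProUTomat] {status} - report: {report_url}"


def failing_compile() -> None:
    # Stand-in for a step that breaks, e.g. a grammar that no longer compiles.
    raise RuntimeError("grammar did not compile")


# Hypothetical steps; real ones would call the build and test tools.
steps = [
    ("update sources", lambda: None),
    ("compile resources", failing_compile),
    ("test grammars", lambda: None),
]
status, log = run_pipeline(steps)
print(notification_subject(status, "http://example.org/report.html"))
```

Because the notification is built from the overall status, the same message path serves both the successful and the failing run; only the status word changes.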