rgbF: An Open Source Tool for n-gram Based Automatic Evaluation of Machine Translation Output

Maja Popovic

In: The Prague Bulletin of Mathematical Linguistics (PBML), Vol. 98, Pages 99-108, Charles University, Prague, 10/2012.


We describe rgbF, a tool for automatic evaluation of machine translation output based on n-gram precision and recall. The tool calculates the F-score averaged over all n-grams of an arbitrary set of distinct units such as words, morphemes, or POS tags, using the arithmetic mean for n-gram averaging. As input, the tool requires one or more reference translations and a hypothesis, both containing the same combination of units. The default output is the document-level 4-gram F-score of the desired unit combination. Sentence-level scores can be obtained on demand, as can precision and/or recall scores, separate unit scores, and separate n-gram scores. In addition, weights can be specified for both n-grams and units, and a desired n-gram order n can be set.
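To make the arithmetic concrete, the following is a minimal sketch of an n-gram F-score with arithmetic averaging over n-gram orders. It is not the rgbF implementation: the function names `ngram_counts` and `ngram_f_score` are hypothetical, and the sketch assumes a single reference, a single unit type (already tokenized), uniform n-gram weights, clipped n-gram matching, and a per-order F-score that is averaged afterwards. The tool itself may combine precision and recall in a different order and supports several unit types, weights, and document-level aggregation.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_f_score(hypothesis, reference, max_n=4):
    """Arithmetic mean of per-order n-gram F-scores for n = 1..max_n.

    Assumption (not taken from the paper): the F-score is computed
    separately for each order and the orders are then averaged uniformly.
    """
    scores = []
    for n in range(1, max_n + 1):
        hyp = ngram_counts(hypothesis, n)
        ref = ngram_counts(reference, n)
        # Clipped matches: an n-gram counts at most as often as it
        # occurs in the other side (Counter intersection takes the minimum).
        matches = sum((hyp & ref).values())
        hyp_total = sum(hyp.values())
        ref_total = sum(ref.values())
        precision = matches / hyp_total if hyp_total else 0.0
        recall = matches / ref_total if ref_total else 0.0
        if precision + recall == 0:
            f_score = 0.0
        else:
            f_score = 2 * precision * recall / (precision + recall)
        scores.append(f_score)
    return sum(scores) / max_n


if __name__ == "__main__":
    hyp = "the cat sat on a mat".split()
    ref = "the cat sat on the mat".split()
    print(round(ngram_f_score(hyp, ref), 4))
```

The same function can be applied to any unit sequence, for example a list of POS tags instead of words, as long as hypothesis and reference use the same kind of unit.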



Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence