**rgbF: An Open Source Tool for n-gram Based Automatic Evaluation of Machine Translation Output**

*The Prague Bulletin of Mathematical Linguistics*, volume 98, pages 99-108, Charles University, Prague, 10/2012

We describe rgbF, a tool for automatic evaluation of machine translation output based on n-gram precision and recall. The tool calculates the F-score averaged over all n-grams of an arbitrary set of distinct units such as words, morphemes, or POS tags. The arithmetic mean is used for n-gram averaging. As input, the tool requires one or more reference translations and a hypothesis, all containing the same combination of units. The default output is the document-level 4-gram F-score of the desired unit combination. Sentence-level scores can be obtained on demand, as can precision and/or recall scores, separate scores for each unit, and separate scores for each n-gram order. In addition, weights can be assigned both to n-grams and to units, and a different n-gram order n can be specified.
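The scoring scheme described above can be illustrated with a minimal Python sketch. It is not the rgbF implementation: the function names (`ngrams`, `f_score`, `averaged_f`), the use of clipped n-gram matches, the restriction to a single reference, and the choice to compute one F-score per n-gram order before averaging are all assumptions made for illustration. Weights default to uniform, which reduces to the arithmetic mean.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f_score(hyp_sents, ref_sents, n):
    """Document-level F-score for one n-gram order (single reference assumed)."""
    matched = hyp_total = ref_total = 0
    for hyp, ref in zip(hyp_sents, ref_sents):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matched += sum((h & r).values())  # clipped matches (an assumption)
        hyp_total += sum(h.values())
        ref_total += sum(r.values())
    precision = matched / hyp_total if hyp_total else 0.0
    recall = matched / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def averaged_f(hyp_units, ref_units, max_n=4, ngram_weights=None, unit_weights=None):
    """Weighted mean of F-scores over n-gram orders 1..max_n and over units.

    hyp_units / ref_units map a unit name (e.g. "word", "pos") to a list of
    tokenised sentences. With no weights given, this is the arithmetic mean.
    """
    units = list(hyp_units)
    ngram_weights = ngram_weights or [1.0] * max_n
    unit_weights = unit_weights or {u: 1.0 for u in units}
    total = 0.0
    for u in units:
        per_unit = sum(
            w * f_score(hyp_units[u], ref_units[u], n)
            for n, w in zip(range(1, max_n + 1), ngram_weights)
        )
        total += unit_weights[u] * per_unit / sum(ngram_weights)
    return total / sum(unit_weights[u] for u in units)


# Hypothetical toy data: one sentence, two unit types (words and POS tags).
hyp = {"word": [["the", "cat", "sat"]], "pos": [["DT", "NN", "VBD"]]}
ref = {"word": [["the", "cat", "ran"]], "pos": [["DT", "NN", "VBD"]]}
score = averaged_f(hyp, ref, max_n=2)
```

Passing a list such as `ngram_weights=[1, 1, 2, 2]` or a dictionary such as `unit_weights={"word": 2, "pos": 1}` to this sketch shifts the emphasis between n-gram orders or units, mirroring the weighting options mentioned in the abstract.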

