DFKI-LT - RankEval: Open Tool for Evaluation of Machine-Learned Ranking

Eleftherios Avramidis
RankEval: Open Tool for Evaluation of Machine-Learned Ranking
in: Eva Hajičová (ed.):
The Prague Bulletin of Mathematical Linguistics, volume 100, pages 63-72, Charles University in Prague, Prague, Czech Republic, 9/2013
Recent research and applications in the evaluation and quality estimation of Machine Translation require statistical measures for comparing machine-predicted rankings against gold sets annotated by humans. In addition to the existing practice of measuring segment-level correlation with Kendall tau, we propose using ranking metrics from the research field of Information Retrieval, such as Mean Reciprocal Rank, Normalized Discounted Cumulative Gain and Expected Reciprocal Rank. These metrics give more credit to systems that correctly predict the highest-ranked items than to those that correctly predict lower-ranked ones. We present an open source tool that provides an implementation of these metrics. It can either be run independently as a script supporting common formats or be imported into any Python application.
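To illustrate why such metrics favour correct predictions at the top of a ranking, the following minimal Python sketch computes a reciprocal rank and a Normalized Discounted Cumulative Gain from scratch. It is not the RankEval API: the function names (reciprocal_rank, mean_reciprocal_rank, dcg, ndcg) and the input conventions are assumptions made for this example, and the gain formula (2^rel - 1 with a log2 position discount) is one common formulation that may differ from the one used in the tool.

```python
import math


def reciprocal_rank(predicted, gold_best):
    """Return 1/position of the first predicted item found in gold_best.

    `predicted` is a list of items ordered best-first by the system;
    `gold_best` is the set of items humans ranked best (assumed input format).
    Returns 0.0 if no predicted item is in gold_best.
    """
    for position, item in enumerate(predicted, start=1):
        if item in gold_best:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(predicted_lists, gold_best_sets):
    """Average the reciprocal rank over several ranked lists."""
    scores = [reciprocal_rank(p, g) for p, g in zip(predicted_lists, gold_best_sets)]
    return sum(scores) / len(scores)


def dcg(relevances):
    """Discounted cumulative gain of gold relevance scores, listed in predicted order."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 1)
        for position, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Under these assumptions, misplacing the most relevant item costs more than swapping two lower items: ndcg([1, 3, 2]) is lower than ndcg([3, 1, 2]), although each differs from the ideal order [3, 2, 1] by a single swap of adjacent items.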
Files: BibTeX, art-avramidis.pdf