DFKI-LT - RankEval: Open Tool for Evaluation of Machine-Learned Ranking
The Prague Bulletin of Mathematical Linguistics, volume 100
Recent research and applications for evaluation and quality estimation of Machine Translation require statistical measures for comparing machine-predicted ranking against gold sets annotated by humans. In addition to the existing practice of measuring segment-level correlation with Kendall tau, we propose using ranking metrics from the research field of Information Retrieval, such as Mean Reciprocal Rank, Normalized Discounted Cumulative Gain and Expected Reciprocal Rank. These metrics reward a system more for correctly predicting the highest-ranked items than for correctly predicting the lower-ranked ones. We present an open source tool providing an implementation of these metrics. It can either be run independently as a script supporting common formats or be imported into any Python application.
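
To make the metrics named in the abstract concrete, the following is a minimal sketch of how they could be computed for a single segment. It is not RankEval's own code or API: all function names are hypothetical, ranks are assumed to be integers with 1 as the best, the mapping from a gold rank to a relevance grade (number of items minus rank) is an illustrative assumption, and tied pairs are simply skipped in the Kendall tau variant shown here.

```python
import math
from itertools import combinations


def relevance(gold):
    # Assumption: turn gold ranks (1 = best) into grades, best item gets n - 1.
    n = len(gold)
    return [n - r for r in gold]


def order_by(predicted):
    # Item indices sorted from best to worst predicted rank.
    return sorted(range(len(predicted)), key=lambda i: predicted[i])


def reciprocal_rank(predicted, gold):
    # 1 / (predicted rank of the item that the annotators ranked best).
    best = min(range(len(gold)), key=lambda i: gold[i])
    return 1.0 / predicted[best]


def mean_reciprocal_rank(segments):
    # segments: iterable of (predicted, gold) rank lists, one pair per segment.
    scores = [reciprocal_rank(p, g) for p, g in segments]
    return sum(scores) / len(scores)


def dcg(grades):
    # Discounted cumulative gain with exponential gain and log2 discount.
    return sum((2 ** g - 1) / math.log2(pos + 2) for pos, g in enumerate(grades))


def ndcg(predicted, gold):
    rel = relevance(gold)
    ideal = dcg(sorted(rel, reverse=True))
    achieved = dcg([rel[i] for i in order_by(predicted)])
    return achieved / ideal if ideal > 0 else 0.0


def expected_reciprocal_rank(predicted, gold):
    # Cascade model: the reader stops at a position with probability
    # depending on that item's grade, and earns 1 / position on stopping.
    rel = relevance(gold)
    max_rel = max(rel)
    score, p_continue = 0.0, 1.0
    for pos, i in enumerate(order_by(predicted), start=1):
        p_stop = (2 ** rel[i] - 1) / 2 ** max_rel
        score += p_continue * p_stop / pos
        p_continue *= 1 - p_stop
    return score


def kendall_tau(predicted, gold):
    # (concordant - discordant) / (concordant + discordant); ties are skipped.
    concordant = discordant = 0
    for i, j in combinations(range(len(gold)), 2):
        sign = (predicted[i] - predicted[j]) * (gold[i] - gold[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0
```

In this sketch, swapping the two best items (gold ranks `[1, 2, 3]`, predicted ranks `[2, 1, 3]`) lowers the reciprocal rank, NDCG and ERR noticeably, whereas swapping the two worst items leaves the reciprocal rank unchanged, which illustrates the emphasis on the top of the ranking described above.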
Files: BibTeX, art-avramidis.pdf