Proceedings of the LREC 2016 Workshop "Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem"

Georg Rehm; Aljoscha Burchardt; Ondrej Bojar; Christian Dugast; Marcello Federico; Josef van Genabith; Barry Haddow; Jan Hajic; Kimberley Harris; Philipp Koehn; Matteo Negri; Martin Popel; Lucia Specia; Marco Turchi; Hans Uszkoreit (eds.)

Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem, located at LREC 2016, May 24, Portorož, Slovenia, 5/2016.


Current approaches to Machine Translation (MT) or professional translation evaluation, both automatic and manual, are characterised by a high degree of fragmentation, heterogeneity and a lack of interoperability between methods, tools and data sets. As a consequence, it is difficult to reproduce, interpret, and compare evaluation results. In an attempt to address this issue, the main objective of this workshop is to bring together researchers working on translation evaluation and practitioners (translators, users of MT, language service providers, etc.). The workshop takes an in-depth look at an area of ever-increasing importance. Two clear trends have emerged over the past several years. The first is the standardisation of evaluation in research through large shared tasks, in which actual translations are compared to reference translations using automatic metrics or human ranking. The second focuses on achieving high-quality (HQ) translations with the help of increasingly complex data sets that contain many levels of annotation based on sophisticated quality metrics, often organised in the context of smaller shared tasks. In industry, we also observe an increased interest in workflows for HQ outbound translation that combine Translation Memories, MT, and post-editing. In stark contrast to this trend towards quality translation, with its inherent overall approach and complexity, the data and tooling landscapes remain rather heterogeneous, uncoordinated and not interoperable. The event brings together researchers, users and providers of tools, and users and providers of manual and automatic translation evaluation methodologies.
We want to initiate a dialogue and discuss whether the current approach, involving a diverse, heterogeneous and distributed set of data, tools, scripts, and evaluation methodologies, is adequate, or whether the community should instead collaborate on building an integrated ecosystem that provides better and more sustainable access to data sets, evaluation workflows, tools, approaches, and metrics supporting processes such as annotation, quality comparison and post-editing. The workshop is meant to stimulate a dialogue about the commonalities, similarities and differences of the existing solutions in three areas: (1) tools, (2) methodologies, and (3) data sets. A key question concerns the trade-off between the high flexibility but lack of interoperability of heterogeneous approaches and a homogeneous approach that would offer less flexibility but higher interoperability, thus allowing, e.g., integrated research by means of an MT app store (cf. the Translingual Cloud anticipated in the META-NET Strategic Research Agenda). How much flexibility and interoperability does the translation community need? How much does it want? How can communication and collaboration between industry and research be intensified? We hope that the papers presented and discussed at the workshop provide at least partial answers to these, and other, crucial questions around the complex and interdisciplinary topic of evaluating translations, whether produced by machines or by human experts.


Further Links

LREC-2016-MT-Eval-Workshop-Proceedings.pdf (PDF, 5 MB)

German Research Center for Artificial Intelligence