The archive contains data gathered within the ROMIP Machine Translation evaluation track in 2013.
The dataset contains 945 English sentences (original_sentences.txt) and their translations into Russian made by human translators (professional_translations.txt) and machines (machine_translations.txt). There are 11 runs in total (seven submitted by participants and four from free online systems, OS1..OS4).
Manual evaluation was carried out on 330 sentences for eight runs in total (four participants' runs and four online systems). The file all_assessments.txt contains the complete assessment results, while the files assessments_system1_system2.txt contain pairwise system comparisons. The former file lists tab-separated sentence IDs, assessor IDs, and system names with corresponding scores (a score of 2 marks the better of the two translations; two 1s mean a tie). There is a slight overlap in the assessments: translations of 60 sentences were judged by two assessors.
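The scoring scheme described above can be tallied with a short script. The following is only a minimal sketch in Python: it assumes each row of all_assessments.txt has six tab-separated columns in the order sentence ID, assessor ID, first system, its score, second system, its score. That column order is an assumption and should be checked against the actual file. The sample rows and the participant run name "P3" are invented for illustration and are not taken from the data.

```python
import csv
import io
from collections import Counter


def read_assessments(stream):
    """Parse tab-separated assessment rows, skipping empty lines."""
    return [row for row in csv.reader(stream, delimiter="\t") if row]


def tally_pairwise(rows):
    """Count wins and ties per system.

    ASSUMED row layout (not confirmed by the dataset description):
    sentence_id, assessor_id, system_a, score_a, system_b, score_b
    """
    wins, ties = Counter(), Counter()
    for _sentence_id, _assessor_id, sys_a, score_a, sys_b, score_b in rows:
        score_a, score_b = int(score_a), int(score_b)
        if score_a == score_b:
            # Two equal scores (two 1s) mean a tie.
            ties[sys_a] += 1
            ties[sys_b] += 1
        elif score_a > score_b:
            # The higher score (2) marks the better translation.
            wins[sys_a] += 1
        else:
            wins[sys_b] += 1
    return wins, ties


# Invented sample rows; "P3" is a hypothetical participant run name.
sample = (
    "17\tA1\tOS1\t2\tOS2\t1\n"
    "17\tA2\tOS1\t1\tOS2\t1\n"
    "42\tA1\tP3\t1\tOS1\t2\n"
)
rows = read_assessments(io.StringIO(sample))
wins, ties = tally_pairwise(rows)
print(dict(wins), dict(ties))
```

To run this on the real data, replace the in-memory sample with an opened file, for example open("all_assessments.txt", encoding="utf-8", newline=""). The file encoding is also an assumption.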
Additionally, the dataset includes a log (crowd_translation_log.csv) of crowd-sourced translation of test sentences on the translatedby platform.
If you publish work based on the data, please cite the following paper:
Pavel Braslavski, Alexander Beloborodov, Maxim Khalilov, Serge Sharoff. English→Russian MT Evaluation Campaign. In Proc. of the ACL-2013, Vol. 2: Short Papers, 2013, pp. 262--267.
The paper also explains details on data preparation and evaluation methodology.
May 21, 2013