Machine Translation Evaluation Workshop & Shared Task
Aim & Scope
The aims of the workshop are (1) to develop a common testbed and (2) to evaluate different machine translation approaches for the English-Russian language pair. Translations will be evaluated on an unseen test set using different machine translation evaluation methods. The workshop is organized by the Russian Information Retrieval Evaluation Seminar (ROMIP) in cooperation with TAUS Labs. The workshop is open to all kinds of MT systems and technologies. Experienced and early-stage researchers, as well as industrial developers, are welcome to participate in the evaluation campaign and the workshop. The workshop will take place following the Dialog conference on computational linguistics and intelligent technologies, which is traditionally held at the end of May or the beginning of June in the Moscow suburbs.
A test dataset of about 150,000 sentences originally written in English will be made available to the participants, who will be requested to submit the whole dataset translated into Russian. The dataset consists of about 10,800 news articles concatenated into a single file. The sentences in the file are numbered continuously, one sentence per line; a blank line separates documents. A tab delimits the sentence number and the sentence itself. Since the dataset is prepared in a fully automated manner, some flaws are present in the data (e.g. incorrect sentence boundaries, noisy document markup, etc.). Please refer to the example. The format of the results is the same: sentence number, tab, translation. All sentence numbers must be preserved in the result file. In case your system rejects a sentence for some reason, please preserve the line beginning with the sentence number. Participants submit their results by sending a link to a zipped file to the organizers. The participants are free to use their own systems and any data, including freely available resources, to complete the task.
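As an illustration of the expected input and output layout, here is a minimal Python sketch; the file names and the translate() stub are hypothetical and stand in for an actual MT system:

```python
# Minimal sketch of reading the source file and writing a result file
# in the required format. File names and the translate() stub are
# hypothetical; plug in an actual MT system.

def translate(sentence):
    # Placeholder MT system: return the Russian translation,
    # or None to mark the sentence as rejected.
    return sentence  # identity stand-in for demonstration only

with open("source.en.txt", encoding="utf-8") as src, \
     open("result.ru.txt", "w", encoding="utf-8") as out:
    for line in src:
        line = line.rstrip("\n")
        if not line:                       # blank line = document boundary
            out.write("\n")                # mirror the input layout
            continue
        number, _, sentence = line.partition("\t")
        translation = translate(sentence)
        if translation is None:            # rejected: keep the number only
            out.write(number + "\n")
        else:
            out.write(number + "\t" + translation + "\n")
```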
The evaluation will be carried out on about 1,000 sentences sampled from the test dataset. Human translators will translate these sentences to ensure gold-standard quality. We will employ two types of evaluation measures: automatic metrics computed against the reference translations, and manual assessment by human judges.
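For the automatic part, the following sketch scores system output against the reference translations using corpus-level BLEU from NLTK, purely as a representative metric; the workshop's actual metric set is not specified above:

```python
# Illustrative only: BLEU stands in for whatever automatic metrics
# the campaign actually uses. Each hypothesis is scored against one
# or more tokenized human reference translations.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# One entry per test sentence: a list of reference token lists,
# and the corresponding system output as a token list.
references = [[["нападающий", "забил", "гол"]]]
hypotheses = [["гол", "забил", "нападающий"]]

score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method1)
print(f"corpus BLEU: {score:.3f}")
```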
We expect the participants to share the organizational costs, either by taking part in the human assessment or by contributing in monetary form.
20 December 2012 - announcement & data samples
Potential participants are asked to fill in an online form so that the organizers can estimate participation and the resources needed for evaluation. Filling out the form does not imply any obligations.
Pavel Braslavski (Kontur Labs/Ural Federal University)