Call for Papers

Note: The deadline for submission has passed.

While the machine translation (MT) research community has produced a significant body of work on the development and meta-evaluation of automatic metrics for assessing overall MT quality, less attention has been paid to more operational evaluation metrics aimed at testing whether translations are adequate within a specific context (purpose, end-user, task, etc.) and at diagnosing why an MT system fails in some cases. Both questions can benefit from some form of manual analysis. Most work in this area is limited to productivity tests (e.g. contrasting the time needed for human translation versus MT post-editing). A few initiatives consider more detailed metrics for the problem, which can also be used to understand and diagnose errors in MT systems. These include the Multidimensional Quality Metrics (MQM) recently proposed by the EU FP7 project QTLaunchPad, the TAUS Dynamic Quality Framework, and past projects such as FEMTI, EAGLES and ISLE. Some of these metrics are also applicable to the evaluation of human translation. A number of task-based metrics have also been proposed for applications such as topic identification/triage and reading comprehension.

The purpose of this workshop is to bring together representatives from academia, industry and government institutions to discuss and assess metrics for manual quality evaluation and compare them through correlation analysis with well-established metrics for automatic evaluation such as BLEU, METEOR and others, as well as reference-less metrics for quality prediction.
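To illustrate the kind of correlation analysis envisaged here, the sketch below computes Pearson's r between segment-level manual adequacy judgments and automatic metric scores. It is a minimal, self-contained example; all scores are invented for illustration and do not come from any real evaluation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented segment-level scores, for illustration only:
manual_adequacy = [4, 2, 5, 3, 1]                   # e.g. 1-5 adequacy judgments
automatic_scores = [0.71, 0.40, 0.83, 0.55, 0.22]   # e.g. BLEU-like segment scores

print(round(pearson_r(manual_adequacy, automatic_scores), 3))  # -> 0.998
```

In practice the automatic scores would come from metrics such as BLEU or METEOR (or from a reference-less quality estimation system), and rank correlations (Spearman, Kendall) are often reported alongside Pearson's r when the manual judgments are ordinal.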

The workshop will benefit from datasets already collected and manually annotated for translation errors by the QTLaunchPad project as part of a shared task on error annotation and automatic translation quality estimation.


We will accept two types of submissions:

  • Abstract (of up to one page) OR
  • One-page abstract plus full paper (6-10 pages)

For formatting guidelines for full papers, please use the LREC submission format found at

Both abstracts and full papers may address any of the topics included in this CFP (see below), but full papers allow authors to present their work and ideas in greater detail. All submissions (abstract-only or abstract plus paper) must be received by the submission deadline below and will be reviewed by experts in the field. Short slots for oral presentation will be given to all accepted submissions, regardless of their format. The workshop proceedings will consist of the abstracts from all accepted submissions and selected full papers from abstract-plus-paper submissions.


The workshop welcomes submissions on the following topics:

  • task-based translation evaluation metrics: specifically, metrics for machine (and/or human) translation quality evaluation and quality estimation, whether these metrics are automatic, semi-automatic or manual (rubrics, error annotation, etc.),
  • error analysis of machine (and human) translations, automated and manual: for example, studies exploring whether manually annotated translations can contribute to the automatic detection of specific translation errors, and whether this can be used to automatically correct translations,
  • correlation between translation evaluation metrics, error analysis, and the task-suitability of translations.

Submission Information

The workshop will consist of a half-day of short presentations on the above topics, followed by a half-day of hands-on collaborative work with MT metrics that show promise for predicting the task suitability of MT output. The afternoon hands-on work will follow on from the morning's presentations. All submissions, whether abstracts or abstracts plus papers, should therefore address at least the following points:

  • definition of the metric(s) being proposed, along with an indication of whether the metric is manual or automated,
  • method of computation of the metric(s), if not already well-known,
  • discussion of the applicability of the metric(s) to determining task suitability of MT output, and
  • indication of human (annotation) effort necessary to produce the metric(s).

Submissions will be handled via the START Conference Manager:

Important Dates

  • Submission of Abstract or Abstract plus Paper: February 7, 2014. Deadline has been extended until February 14, 2014. [NOTE: Author(s) intending to submit a full paper must submit the full paper along with the abstract in order for it to be considered for inclusion in the workshop and publication in the workshop proceedings. Papers cannot be added to abstract-only submissions after the notification of acceptance.]
  • Notification to authors: March 10, 2014
  • Camera-ready versions of accepted Abstract or Abstract plus Paper due to organizing committee: March 28, 2014
  • Workshop Date: May 26, 2014

Organizing Committee

  • Keith J. Miller (MITRE)
  • Lucia Specia (University of Sheffield)
  • Kim Harris (GALA and text & form)
  • Stacey Bailey (MITRE)

Share your LRs

When making a submission via the START page, authors will be asked to provide essential information about the resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or that are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments, including evaluation experiments.