RedBall

This metric derives from a tool originally developed by RWS Moravia to compute post-editor productivity, that is, the approximate (and comparable) time spent on a post-editing job. The tool analyses two corresponding sets of text (the first being the MT output and the second the final post-edited text) and identifies the following six types of word-level change, together with the number of occurrences of each:

  • Inserted words

  • Removed words

  • Updated words

  • Updated and moved (in the sentence) words

  • Unchanged words

  • Unchanged and moved (in the sentence) words

Each type of change is then assigned a Time Cost, the approximate time spent on one such change (e.g., the time spent deleting a single word). The result of the RedBall tool is the sum, over all six types, of the number of changes multiplied by the corresponding Time Cost:

Overall Productivity of Post-Editing =
            [(number of Inserted words * Inserted words Time Cost) +
            (number of Removed words * Removed words Time Cost) +
            (number of Updated words * Updated words Time Cost) +
            (number of Updated and moved words * Updated and moved words Time Cost) +
            (number of Unchanged words * Unchanged words Time Cost) +
            (number of Unchanged and moved words * Unchanged and moved words Time Cost)]

The Time Cost for each type of change can be parametrized individually and can differ between customers, content types, and languages. Overall Productivity is therefore a very subjective number: it can be compared with another productivity value only when both were computed with the same settings. This is usually done within a single localization project to compare delivery times.
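The weighted sum above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary keys, and all numeric values below are hypothetical and are not taken from the RedBall tool itself, whose interface is not described here.

```python
# Hypothetical keys for the six change types; not the RedBall tool's own names.
CHANGE_TYPES = (
    "inserted",
    "removed",
    "updated",
    "updated_moved",
    "unchanged",
    "unchanged_moved",
)


def overall_productivity(counts, time_costs):
    """Return the sum of (word count * Time Cost) over the six change types.

    counts: words per change type (missing types count as 0).
    time_costs: Time Cost per change type, in seconds per word.
    """
    missing = [t for t in CHANGE_TYPES if t not in time_costs]
    if missing:
        raise ValueError(f"missing Time Cost for: {', '.join(missing)}")
    return sum(counts.get(t, 0) * time_costs[t] for t in CHANGE_TYPES)


# Illustrative values only: a made-up project-specific parametrization.
example_costs = {
    "inserted": 10.0,
    "removed": 3.0,
    "updated": 5.0,
    "updated_moved": 8.0,
    "unchanged": 2.0,
    "unchanged_moved": 4.0,
}
example_counts = {
    "inserted": 4,
    "removed": 2,
    "updated": 3,
    "updated_moved": 1,
    "unchanged": 20,
    "unchanged_moved": 2,
}

# 40 + 6 + 15 + 8 + 40 + 8 seconds
print(overall_productivity(example_counts, example_costs))  # 117.0
```

Because the Time Costs are passed in as a parameter, two results are comparable only if the same `time_costs` mapping was used for both, which mirrors the restriction described above.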

Implementation Details

RedBall TER: edit distance computed from productivities.

We considered standardizing the output of the RedBall tool so that it could be used to compare the quality of MT output. For such a measurement, the RedBall tool is run with standard Time Costs on the MT output and a reference translation; the result is the productivity needed to turn the MT output into the reference text. In addition, a "reference" productivity is needed to compare this value against. As the reference productivity, we used the productivity of translating the content from scratch (all content treated as inserted words), also computed with the RedBall tool using the standardized Time Cost settings.

RedBall TER = (Productivity to achieve reference text from MT) / (Reference Productivity)

The standardized Time Costs (TC) used for computing RedBall TER are as follows (all in seconds per word):

  • TC for Inserted = 28800 / 2200 = 13.09 s

  • TC for Removed = 3 s (time of reading and removing)

  • TC for Unchanged = 2 s (time of reading)

  • TC for Unchanged Moved = 5 s (time of reading, copying and pasting)

  • TC for Updated = {TC for Inserted} / 2 = 6.55 s

  • TC for Updated Moved = {TC for Updated} + {TC for Unchanged Moved} = 11.55 s

Here, TC for Inserted is calculated as {number of seconds in a working day (8 hours * 60 min * 60 s = 28800 s)} / {average daily productivity of a translator (2200 new words/day)}.
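As a worked example, the standardized Time Costs and the RedBall TER ratio can be written out in Python. This is a minimal sketch, not the tool's implementation: the function and key names are hypothetical, the word counts are invented, and it assumes that the reference productivity is the number of words in the reference translation multiplied by TC for Inserted.

```python
SECONDS_PER_WORKING_DAY = 8 * 60 * 60  # 28800 s
NEW_WORDS_PER_DAY = 2200               # average daily productivity of a translator

TC_INSERTED = SECONDS_PER_WORKING_DAY / NEW_WORDS_PER_DAY  # about 13.09 s
TC_UPDATED = TC_INSERTED / 2                               # about 6.55 s
TC_UNCHANGED_MOVED = 5.0

# Standardized Time Costs in seconds per word (keys are hypothetical names).
STANDARD_TC = {
    "inserted": TC_INSERTED,
    "removed": 3.0,
    "unchanged": 2.0,
    "unchanged_moved": TC_UNCHANGED_MOVED,
    "updated": TC_UPDATED,
    "updated_moved": TC_UPDATED + TC_UNCHANGED_MOVED,      # about 11.55 s
}


def redball_ter(counts, reference_word_count):
    """Productivity to reach the reference from MT, divided by the
    productivity of translating the reference from scratch.

    Assumption: "from scratch" means every reference word is costed
    as an inserted word.
    """
    if reference_word_count <= 0:
        raise ValueError("reference_word_count must be positive")
    post_editing = sum(counts.get(k, 0) * tc for k, tc in STANDARD_TC.items())
    from_scratch = reference_word_count * TC_INSERTED
    return post_editing / from_scratch


# Invented counts for a 20-word reference (every type except "removed"
# contributes words to the reference: 2 + 3 + 0 + 14 + 1 = 20).
counts = {
    "inserted": 2,
    "removed": 1,
    "updated": 3,
    "updated_moved": 0,
    "unchanged": 14,
    "unchanged_moved": 1,
}
print(round(redball_ter(counts, reference_word_count=20), 4))  # 0.3125
```

Under these costs an MT output identical to the reference does not score 0, because unchanged words still carry a reading cost of 2 s each; the lowest possible value is 2 / 13.09, or roughly 0.15.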