RedBall
This metric comes from a tool originally developed by RWS Moravia to estimate post-editor productivity, that is, the approximate (and comparable) time spent on a post-editing job. The tool analyzes two corresponding sets of text (the MT output and the final post-edited text) and identifies the following six types of word-level change, together with the number of changes of each type:
Inserted words
Removed words
Updated words
Updated and moved (in the sentence) words
Unchanged words
Unchanged and moved (in the sentence) words
Each type of change is then weighted by the approximate time spent on it, its Time Cost (e.g., the time spent deleting a single word). The result of the RedBall tool is the sum, over all six types, of the number of changes multiplied by the corresponding Time Cost:
Overall Productivity of Post-Editing =
[(number of Inserted words * Inserted words Time Cost) +
(number of Removed words * Removed words Time Cost) +
(number of Updated words * Updated words Time Cost) +
(number of Updated and moved words * Updated and moved words Time Cost) +
(number of Unchanged words * Unchanged words Time Cost) +
(number of Unchanged and moved words * Unchanged and moved words Time Cost)]
The Time Cost for each type of change can be set individually and can differ between customers, content types, and languages. Overall Productivity is therefore a settings-dependent number that can be compared with another productivity only if both were computed with the same settings; this is usually done within a single localization project to compare delivery times.
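As a rough illustration of the formula above, the weighted sum can be sketched in Python. The names `ChangeCounts` and `overall_productivity` and the sample Time Costs are assumptions made for this example; they are not taken from the RedBall tool itself.

```python
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class ChangeCounts:
    """Word-level counts for the six change types (hypothetical container)."""
    inserted: int = 0
    removed: int = 0
    updated: int = 0
    updated_moved: int = 0
    unchanged: int = 0
    unchanged_moved: int = 0


def overall_productivity(counts: ChangeCounts, time_costs: Mapping[str, float]) -> float:
    """Sum of (number of changes * Time Cost) over all six change types, in seconds."""
    return sum(getattr(counts, f.name) * time_costs[f.name] for f in fields(counts))


# Illustrative Time Costs in seconds per word (made-up values); in practice they
# are configured per customer, content type, and language.
example_costs = {
    "inserted": 10.0,
    "removed": 2.0,
    "updated": 5.0,
    "updated_moved": 8.0,
    "unchanged": 1.0,
    "unchanged_moved": 4.0,
}

counts = ChangeCounts(inserted=3, removed=2, updated=4,
                      updated_moved=1, unchanged=50, unchanged_moved=2)
print(overall_productivity(counts, example_costs))  # estimated seconds for the job
```

Keeping the Time Costs in a separate mapping mirrors the point made above: two results are comparable only when they were computed with the same mapping.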
Implementation Details
RedBall TER: edit distance computed from productivities.
We considered standardizing the output of the RedBall tool so that it can be used to compare the quality of MT output. For such a measurement, a standard set of Time Costs is applied in the RedBall tool to the MT output and the reference translation; the result is the productivity needed to turn the MT output into the reference text. In addition, a “reference” productivity is needed to compare that productivity against. As the reference productivity, we used the productivity of translating the content from scratch (all content treated as inserted words), also computed with the RedBall tool using the standardized Time Cost settings.
RedBall TER = (Productivity to achieve reference text from MT) / (Reference Productivity)
The standardized Time Costs (TC) used to compute RedBall TER are as follows (all in seconds per word):
TC for Inserted = 28800 / 2200 = 13.09 s
TC for Removed = 3 s (time of reading and removing)
TC for Unchanged = 2 s (time of reading)
TC for Unchanged Moved = 5 s (time of reading, copying and pasting)
TC for Updated = {TC for Inserted} / 2 = 6.55 s
TC for Updated Moved = {TC for Updated} + {TC for Unchanged Moved} = 11.55 s
Where TC for Inserted is calculated as {number of seconds in a working day (8 hours * 60 min * 60 s = 28800 s)} / {average daily productivity of a translator (2200 new words/day)}
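The derivation of the standardized Time Costs and the RedBall TER ratio can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `STANDARD_TIME_COSTS`, `productivity`, and `redball_ter` are hypothetical, the change counts in the usage example are made up, and the reference word count is passed in explicitly rather than derived from the counts. The costs are kept unrounded; rounded to two decimals they match the values listed above.

```python
SECONDS_PER_WORKING_DAY = 8 * 60 * 60   # 28800 s
NEW_WORDS_PER_DAY = 2200                # average daily translator output used above

tc_inserted = SECONDS_PER_WORKING_DAY / NEW_WORDS_PER_DAY   # ~13.09 s
tc_unchanged_moved = 5.0                                     # read, copy, paste
tc_updated = tc_inserted / 2                                 # ~6.55 s

STANDARD_TIME_COSTS = {
    "inserted": tc_inserted,
    "removed": 3.0,                                          # read and remove
    "unchanged": 2.0,                                        # read
    "unchanged_moved": tc_unchanged_moved,
    "updated": tc_updated,
    "updated_moved": tc_updated + tc_unchanged_moved,        # ~11.55 s
}


def productivity(counts, costs=STANDARD_TIME_COSTS):
    """Estimated time in seconds: sum of count * Time Cost per change type."""
    return sum(counts.get(change, 0) * cost for change, cost in costs.items())


def redball_ter(mt_to_reference_counts, reference_word_count,
                costs=STANDARD_TIME_COSTS):
    """Productivity to reach the reference from MT, divided by the
    productivity of translating the reference from scratch."""
    from_scratch = productivity({"inserted": reference_word_count}, costs)
    return productivity(mt_to_reference_counts, costs) / from_scratch


# Made-up counts for a 100-word reference translation.
sample_counts = {"inserted": 5, "removed": 3, "updated": 10,
                 "updated_moved": 2, "unchanged": 80, "unchanged_moved": 3}
ter = redball_ter(sample_counts, reference_word_count=100)
print(f"RedBall TER: {ter:.3f}")
```

Under these definitions, MT output that has to be retyped in full scores 1.0, and lower values mean less estimated post-editing time relative to translating from scratch.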