TER

Translation Edit Rate (TER) is one of the standard error metrics for automatic evaluation of machine translation [Snover_2006]. The metric is described in full in the research paper. In brief, the formula is:

TER = # of edits / average # of reference words

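An edit is an insertion, deletion, or substitution of a single word, or a shift of a word sequence. TER is never negative, and lower is better: 0 means the output matches a reference exactly. Values usually fall between 0 and 1, but TER can exceed 1 when the number of edits is greater than the average reference length.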
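As a quick worked example of the formula above, the following Python sketch divides an edit count by the average reference length. The function name `ter_from_counts` and the sample numbers are made up for illustration. The sketch assumes the number of edits is already known; it does not search for the edits itself.

```python
from typing import Sequence


def ter_from_counts(num_edits: int, reference_lengths: Sequence[int]) -> float:
    """Return # of edits divided by the average # of reference words."""
    if not reference_lengths:
        raise ValueError("at least one reference is required")
    average_length = sum(reference_lengths) / len(reference_lengths)
    if average_length == 0:
        raise ValueError("references must contain at least one word on average")
    return num_edits / average_length


# Hypothetical numbers: 3 edits, two references of 10 and 14 words.
# The average reference length is 12, so TER = 3 / 12.
print(ter_from_counts(3, [10, 14]))  # 0.25
```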

Implementation Details

The current implementation is based on the SacreBleu library [SacreBleu], which includes a TER module.

Our settings are as follows:

We use no case distinction (the TER default).

We apply normalization (for Chinese and Japanese, language-specific tokenization is applied as well).

We remove the brackets around tags (<bold> becomes bold).

For Thai, we also apply the Apache OpenNLP tokenizer.
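The settings above could be approximated with SacreBleu roughly as in the sketch below. This is not our production code. The helper names `strip_tag_brackets` and `score_ter` are hypothetical, and it is an assumption that "normalization" corresponds to the `normalized=True` option and that the Chinese and Japanese tokenization corresponds to `asian_support=True` of SacreBleu's `TER` class. The treatment of closing tags is also an assumption, and the Thai tokenization step with Apache OpenNLP is not covered.

```python
import re
from typing import List

# Matches simple tags such as <bold> or </bold>, but not a bare "<" or ">"
# surrounded by spaces. Which strings count as tags is an assumption.
_TAG = re.compile(r"<(/?\w[\w-]*)>")


def strip_tag_brackets(text: str) -> str:
    """Drop the angle brackets around tags, e.g. '<bold>' becomes 'bold'."""
    return _TAG.sub(r"\1", text)


def score_ter(hypotheses: List[str], references: List[str]) -> float:
    """Score hypotheses against one reference per segment (hypothetical helper)."""
    # Imported here so that strip_tag_brackets works without sacrebleu installed.
    from sacrebleu.metrics import TER

    metric = TER(
        normalized=True,       # assumed to match "we apply normalization"
        asian_support=True,    # assumed to match the Chinese/Japanese handling
        case_sensitive=False,  # no case distinction (the default)
    )
    hyps = [strip_tag_brackets(h) for h in hypotheses]
    refs = [strip_tag_brackets(r) for r in references]
    # sacrebleu expects a list of reference streams; here there is a single one.
    return metric.corpus_score(hyps, [refs]).score
```

A hypothesis identical to its reference needs no edits, so `score_ter(["Click <bold> here"], ["Click <bold> here"])` returns 0.0. Depending on the SacreBleu version, the returned score may be scaled to a percentage rather than reported as the raw ratio from the formula.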

References

[Snover_2006]

Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. A Study of Translation Edit Rate with Targeted Human Annotation. In Proceedings of the Association for Machine Translation in the Americas (AMTA), 2006.

[SacreBleu]

https://github.com/mjpost/sacrebleu/tree/master?tab=readme-ov-file#ter