Spaces:
Runtime error
Runtime error
| title: CharacTER | |
| emoji: π€ | |
| colorFrom: orange | |
| colorTo: red | |
| sdk: gradio | |
| sdk_version: 3.19.1 | |
| app_file: app.py | |
| pinned: false | |
| tags: | |
| - evaluate | |
| - metric | |
| - machine-translation | |
| description: >- | |
| CharacTer is a character-level metric inspired by the commonly applied translation edit rate (TER). | |
| # Metric Card for CharacTER | |
| ## Metric Description | |
| CharacTer is a character-level metric inspired by the translation edit rate (TER) metric. It is | |
| defined as the minimum number of character edits required to adjust a hypothesis, until it completely matches the | |
| reference, normalized by the length of the hypothesis sentence. CharacTer calculates the character level edit | |
| distance while performing the shift edit on word level. Unlike the strict matching criterion in TER, a hypothesis | |
| word is considered to match a reference word and could be shifted, if the edit distance between them is below a | |
| threshold value. The Levenshtein distance between the reference and the shifted hypothesis sequence is computed on the | |
| character level. In addition, the lengths of hypothesis sequences instead of reference sequences are used for | |
| normalizing the edit distance, which effectively counters the issue that shorter translations normally achieve lower | |
| TER. | |
| ## Intended Uses | |
| CharacTER was developed for machine translation evaluation. | |
| ## How to Use | |
| ```python | |
| import evaluate | |
| character = evaluate.load("character") | |
| # Single hyp/ref | |
| preds = ["this week the saudis denied information published in the new york times"] | |
| refs = ["saudi arabia denied this week information published in the american new york times"] | |
| results = character.compute(references=refs, predictions=preds) | |
| # Corpus example | |
| preds = ["this week the saudis denied information published in the new york times", | |
| "this is in fact an estimate"] | |
| refs = ["saudi arabia denied this week information published in the american new york times", | |
| "this is actually an estimate"] | |
| results = character.compute(references=refs, predictions=preds) | |
| ``` | |
| ### Inputs | |
| - **predictions**: a single prediction or a list of predictions to score. Each prediction should be a string with | |
| tokens separated by spaces. | |
| - **references**: a single reference or a list of reference for each prediction. Each reference should be a string with | |
| tokens separated by spaces. | |
| ### Output Values | |
| *=only when a list of references/hypotheses are given | |
| - **count** (*): how many parallel sentences were processed | |
| - **mean** (*): the mean CharacTER score | |
| - **median** (*): the median score | |
| - **std** (*): standard deviation of the score | |
| - **min** (*): smallest score | |
| - **max** (*): largest score | |
| - **cer_scores**: all scores, one per ref/hyp pair | |
| ### Output Example | |
| ```python | |
| { | |
| 'count': 2, | |
| 'mean': 0.3127282211789254, | |
| 'median': 0.3127282211789254, | |
| 'std': 0.07561653111280243, | |
| 'min': 0.25925925925925924, | |
| 'max': 0.36619718309859156, | |
| 'cer_scores': [0.36619718309859156, 0.25925925925925924] | |
| } | |
| ``` | |
| ## Citation | |
| ```bibtex | |
| @inproceedings{wang-etal-2016-character, | |
| title = "{C}harac{T}er: Translation Edit Rate on Character Level", | |
| author = "Wang, Weiyue and | |
| Peter, Jan-Thorsten and | |
| Rosendahl, Hendrik and | |
| Ney, Hermann", | |
| booktitle = "Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers", | |
| month = aug, | |
| year = "2016", | |
| address = "Berlin, Germany", | |
| publisher = "Association for Computational Linguistics", | |
| url = "https://aclanthology.org/W16-2342", | |
| doi = "10.18653/v1/W16-2342", | |
| pages = "505--510", | |
| } | |
| ``` | |
| ## Further References | |
| - Repackaged version that is used in this HF implementation: [https://github.com/bramvanroy/CharacTER](https://github.com/bramvanroy/CharacTER) | |
| - Original version: [https://github.com/rwth-i6/CharacTER](https://github.com/rwth-i6/CharacTER) | |