evaluate-metric/rouge

Add disclaimer for ROUGE being non-deterministic when using the aggregator and suggest a solution

by Eran - opened Jun 4, 2023

Files changed (1)

README.md

@@ -15,7 +15,7 @@ description: >-
   evaluating automatic summarization and machine translation software in natural language processing.
   The metrics compare an automatically produced summary or translation against a reference or a set of references (human-produced) summary or translation.
-  Note that ROUGE is case insensitive, meaning that upper case letters are treated the same way as lower case letters.
   This metrics is a wrapper around Google Research reimplementation of ROUGE:
   https://github.com/google-research/google-research/tree/master/rouge
@@ -62,6 +62,11 @@ It can also deal with lists of references for each predictions:
 {'rouge1': 0.8333, 'rouge2': 0.5, 'rougeL': 0.8333, 'rougeLsum': 0.8333}```
 ```
 ### Inputs
 - **predictions** (`list`): list of predictions to score. Each prediction
         should be a string with tokens separated by spaces.

   evaluating automatic summarization and machine translation software in natural language processing.
   The metrics compare an automatically produced summary or translation against a reference or a set of references (human-produced) summary or translation.
+  Note that ROUGE is case insensitive, meaning that upper case letters are treated the same way as lower case letters. Also note that, when the aggregator is used (the default), ROUGE scores are computed with a random resampling algorithm, which is non-deterministic.
   This metrics is a wrapper around Google Research reimplementation of ROUGE:
   https://github.com/google-research/google-research/tree/master/rouge
 {'rouge1': 0.8333, 'rouge2': 0.5, 'rougeL': 0.8333, 'rougeLsum': 0.8333}```
 ```
+You can pass a seed to `load` to initialize the random number generator and keep ROUGE scores from changing between runs:
+```python
+>>> rouge = evaluate.load('rouge', seed=42)
+```
 ### Inputs
 - **predictions** (`list`): list of predictions to score. Each prediction
         should be a string with tokens separated by spaces.
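
To illustrate why resampling-based aggregation varies between runs and why a seed fixes it, here is a minimal sketch using only the Python standard library. It is not the `rouge_score` implementation: `bootstrap_mid` is a hypothetical helper, the scores are made-up values, and the median-of-resampled-means estimate is an assumption chosen for illustration.

```python
import random
import statistics


def bootstrap_mid(scores, n_samples=1000, seed=None):
    """Median of bootstrap-resampled means.

    Hypothetical helper for illustration; not part of evaluate or rouge_score.
    """
    # With seed=None the generator is seeded from OS entropy, so every run
    # draws different resamples. A fixed seed makes the draws repeatable.
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        # Resample the per-example scores with replacement.
        sample = rng.choices(scores, k=len(scores))
        means.append(statistics.fmean(sample))
    return statistics.median(means)


# Made-up per-example scores, used only to exercise the helper.
scores = [0.83, 0.50, 0.91, 0.42, 0.77]

first = bootstrap_mid(scores, seed=42)
second = bootstrap_mid(scores, seed=42)
assert first == second  # same seed, same aggregate

unseeded = bootstrap_mid(scores)  # may differ slightly from run to run
```

The seeded calls return identical values, while the unseeded call can shift a little on every run, which is the behavior the disclaimer in this change describes.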