Update README.md
README.md
CHANGED
|
@@ -45,7 +45,7 @@ pip install qa-metrics
|
|
| 45 |
|
| 46 |
#### Method: `compute_score`
|
| 47 |
**Parameters**
|
| 48 |
-
- `reference_answer` (
|
| 49 |
- `candidate_answer` (str): The answer provided by a candidate that needs to be evaluated
|
| 50 |
|
| 51 |
**Returns**
|
|
@@ -61,6 +61,25 @@ rb.compute_score(reference_answer, candidate_answer)
|
|
| 61 |
# (0.29113227128982544, 2.1645290851593018)
|
| 62 |
```
|
| 63 |
|
| 64 |
## Acknowledgements
|
| 65 |
|
| 66 |
We sincerely appreciate the contributions of the open-source community. The related projects are as follows: [R1-V](https://github.com/Deep-Agent/R1-V) , [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) , [Video-R1](https://github.com/tulerfeng/Video-R1), [Qwen-2.5-VL](https://arxiv.org/abs/2502.13923)
|
|
|
|
| 45 |
|
| 46 |
#### Method: `compute_score`
|
| 47 |
**Parameters**
|
| 48 |
+
- `reference_answer` (str): The gold (correct) answer to the question
|
| 49 |
- `candidate_answer` (str): The answer provided by a candidate that needs to be evaluated
|
| 50 |
|
| 51 |
**Returns**
|
|
|
|
| 61 |
# (0.29113227128982544, 2.1645290851593018)
|
| 62 |
```
|
| 63 |
|
| 64 |
+
|
| 65 |
+
#### Method: `compute_batch_scores`
|
| 66 |
+
**Parameters**
|
| 67 |
+
- `reference_answers` (list of str): A list of gold (correct) answers to the questions
|
| 68 |
+
- `candidate_answer` (list of str): A list of answers provided by a candidate that need to be evaluated
|
| 69 |
+
- `batch_size` (int): Batch size used for prediction (default: 1)
|
| 70 |
+
|
| 71 |
+
**Returns**
|
| 72 |
+
- `tuple`: A tuple of two lists: the normalized scores and the raw scores.
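
The tuple returned by `compute_batch_scores` can be unpacked and matched back to the inputs. The following is a minimal sketch, assuming both score lists are index-aligned with the candidate list (the one-pair example output suggests this, but it is not stated explicitly). `ScoredAnswer` and `pair_batch_scores` are hypothetical helpers, not part of `qa-metrics`, and the score values are copied from the documented example output rather than produced by a model call.

```python
from typing import List, NamedTuple, Sequence, Tuple


class ScoredAnswer(NamedTuple):
    """Hypothetical container for one scored candidate (not part of qa-metrics)."""
    candidate: str
    normalized: float
    raw: float


def pair_batch_scores(
    candidates: Sequence[str],
    batch_result: Tuple[Sequence[float], Sequence[float]],
) -> List[ScoredAnswer]:
    """Zip candidates with a (normalized_scores, raw_scores) tuple.

    Assumption: both score lists are in the same order as `candidates`.
    """
    normalized, raw = batch_result
    if not (len(candidates) == len(normalized) == len(raw)):
        raise ValueError("candidates and score lists must have equal length")
    return [ScoredAnswer(c, n, r) for c, n, r in zip(candidates, normalized, raw)]


candidate_answer = [
    "The movie \"The Princess and the Frog\" is loosely based off "
    "the Brother Grimm's \"Iron Henry\""
]

# Stand-in for rb.compute_batch_scores(reference_answer, candidate_answer, batch_size=1);
# the numbers are the documented example output, hard-coded so the sketch runs offline.
example_result = ([0.29113227128982544], [2.1645290851593018])

rows = pair_batch_scores(candidate_answer, example_result)
for row in rows:
    print(f"normalized={row.normalized:.3f} raw={row.raw:.3f} | {row.candidate}")
```

Raising on a length mismatch is a design choice for this sketch: it surfaces a misaligned batch early instead of silently truncating via `zip`.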
|
| 73 |
+
|
| 74 |
+
```python
|
| 75 |
+
from qa_metrics.RewardBert import RewardBert
|
| 76 |
+
|
| 77 |
+
rb = RewardBert(device='cuda')
|
| 78 |
+
reference_answer = ["The Frog Prince"]
|
| 79 |
+
candidate_answer = ["The movie \"The Princess and the Frog\" is loosely based off the Brother Grimm's \"Iron Henry\""]
|
| 80 |
+
rb.compute_batch_scores(reference_answer, candidate_answer, batch_size=1)
|
| 81 |
+
# ([0.29113227128982544], [2.1645290851593018])
|
| 82 |
+
```
|
| 83 |
## Acknowledgements
|
| 84 |
|
| 85 |
We sincerely appreciate the contributions of the open-source community. The related projects are as follows: [R1-V](https://github.com/Deep-Agent/R1-V) , [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) , [Video-R1](https://github.com/tulerfeng/Video-R1), [Qwen-2.5-VL](https://arxiv.org/abs/2502.13923)
|