Upload README.md with huggingface_hub
README.md
CHANGED
@@ -76,7 +76,7 @@ The 13B model is trained on 32 A100 GPUs. The learning rate (LR) is controlled b
 
 - **Other Popular Benchmarks**: We report the average accuracies on GSM8K (8-shot), MMLU (5-shot), Big Bench Hard (BBH) (3-shot), and AGI-Eval (0-shot). Refer to Appendix~\ref{sec:eval-details} for more details.
 
-
+**Notes**: For PIQA, SIQA, HellaSwag, WinoGrande, COPA, BoolQ, LAMBADA, TyDi QA, and AGI-Eval, we obtain the predicted answers based on maximized perplexity. For GSM8K, MMLU, and BBH, the predicted answers are directly generated.
 
 ### Evaluation Results
 
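
The added **Notes** line distinguishes perplexity-based answer selection from direct generation. As a rough illustration of the first, here is a minimal sketch that scores each candidate answer by its perplexity and picks one. It is not the authors' evaluation code. It assumes the common convention of choosing the candidate with the lowest perplexity (that is, the highest average token likelihood), which may differ from what the README means by "maximized perplexity". The function names `perplexity` and `pick_answer` and all log-probability values are hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of one sequence from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # exp of the negative mean log-probability
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_answer(candidate_logprobs):
    """Return the index of the candidate with the lowest perplexity.

    Assumption: lower perplexity means the model finds the answer more likely.
    """
    scores = [perplexity(lp) for lp in candidate_logprobs]
    return min(range(len(scores)), key=scores.__getitem__)


# Made-up log-probabilities for three answer options of one question.
options = [
    [-2.3, -1.9, -2.8],
    [-0.4, -0.7, -0.5],
    [-1.2, -3.1],
]
best = pick_answer(options)
print(best)
```

Averaging over tokens before exponentiating normalizes for answer length, so a short option is not favored merely for having fewer tokens. In a real evaluation the log-probabilities would come from the model's scores for each candidate continuation given the question.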