nvidia/Minitron-4B-Base


srvm committed on Jul 23, 2024

Commit 5e7fcb9 · 1 Parent(s): d249aff

Add results preview

Files changed (1)

README.md +23 -0

@@ -53,6 +53,29 @@ print(output_text)
 Minitron is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
+## Evaluation Results
+*5-shot performance.* Language Understanding evaluated using [Massive Multitask Language Understanding](https://arxiv.org/abs/2009.03300):
+| Average |
+| :---- |
+| 58.6 |
+*Zero-shot performance.* Evaluated using select datasets from the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) with additions:
+| HellaSwag | Winogrande | GSM8K | ARC-C | XLSum |
+| :------------- | :------------- | :------------- | :------------- | :------------- |
+| 75.0 | 74.0 | 24.1 | 50.9 | 29.5 |
+*Code generation performance.* Evaluated using [HumanEval](https://github.com/openai/human-eval):
+| pass@1, 0-shot |
+| :------------- |
+| 23.3 |
+Please refer to our [paper](https://arxiv.org/abs/2407.14679) for the full set of results.
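
For readers unfamiliar with the HumanEval metric reported above, the following is a minimal sketch of the standard unbiased pass@k estimator, averaged over problems. The README does not state how many samples per problem were drawn or how the 23.3 figure was computed, so the function names and the sample counts below are illustrative assumptions, not the authors' evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: samples generated for the problem (assumed value, not from the README)
    c: samples that passed all unit tests
    k: attempt budget being scored
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # With fewer than k failing samples, every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical run: four problems, one sample each, one problem solved.
hypothetical_results = [(1, 1), (1, 0), (1, 0), (1, 0)]
print(f"pass@1 = {100 * mean_pass_at_k(hypothetical_results, k=1):.1f}")
```

With a single sample per problem, pass@1 reduces to the fraction of problems solved on the first attempt; the combinatorial form only matters when more samples than attempts are drawn.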
 ## Citation
 If you find our work helpful, please consider citing our paper: