Update README.md
## Model Overview

Mistral-NeMo-Minitron-8B-Instruct is a model for generating responses to various text-generation tasks, including roleplaying, retrieval-augmented generation, and function calling. It is a fine-tuned version of [nvidia/Mistral-NeMo-Minitron-8B-Base](https://huggingface.co/nvidia/Mistral-NeMo-Minitron-8B-Base), which was pruned and distilled from [Mistral-NeMo 12B](https://huggingface.co/nvidia/Mistral-NeMo-12B-Base) using [our LLM compression technique](https://arxiv.org/abs/2407.14679). The model was trained using a multi-stage SFT and preference-based alignment technique with [NeMo Aligner](https://github.com/NVIDIA/NeMo-Aligner). For details on the alignment technique, please refer to the [Nemotron-4 340B Technical Report](https://arxiv.org/abs/2406.11704).

Try this model on [build.nvidia.com](https://build.nvidia.com/nvidia/mistral-nemo-minitron-8b-8k-instruct).
pipe(messages, max_length=64)
```

## Evaluation Results

| Category              | Benchmark             | # Shots | Mistral-NeMo-Minitron-8B-Instruct |
|-----------------------|-----------------------|---------|-----------------------------------|
| General               | MMLU                  | 5       | 70.4                              |
|                       | MT Bench (GPT4-Turbo) | 0       | 7.86                              |
| Math                  | GSM8K                 | 0       | 87.1                              |
| Reasoning             | GPQA                  | 0       | 31.5                              |
| Code                  | HumanEval             | 0       | 71.3                              |
|                       | MBPP                  | 0       | 72.5                              |
| Instruction Following | IFEval                | 0       | 84.4                              |
| Tool Use              | BFCL v2 Live          | 0       | 67.6                              |

## AI Safety Efforts

The Mistral-NeMo-Minitron-8B-Instruct model underwent AI safety evaluation including adversarial testing via three distinct methods: