add results
README.md
CHANGED
|
@@ -24,7 +24,7 @@ tags:
|
|
| 24 |
---
|
| 25 |
|
| 26 |
# Ministral 3 14B Instruct 2512
|
| 27 |
-
The largest model in the Ministral 3 family, **Ministral 3 14B** offers frontier capabilities and performance comparable to its larger [Mistral Small 3.2 24B](https://huggingface.co/mistralai/
|
| 28 |
|
| 29 |
This model is the instruct post-trained version in **FP8**, fine-tuned for instruction tasks, making it ideal for chat and instruction-based use cases.
|
| 30 |
|
|
@@ -527,7 +527,112 @@ model = Mistral3ForConditionalGeneration.from_pretrained(
|
|
| 527 |
quantization_config=FineGrainedFP8Config(dequantize=True)
|
| 528 |
)
|
| 529 |
```
|
|
|
|
| 530 |
| 531 |
## License
|
| 532 |
|
| 533 |
This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
|
|
|
|
| 24 |
---
|
| 25 |
|
| 26 |
# Ministral 3 14B Instruct 2512
|
| 27 |
+
The largest model in the Ministral 3 family, **Ministral 3 14B** offers frontier capabilities and performance comparable to its larger [Mistral Small 3.2 24B](https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512) counterpart. It is a powerful and efficient language model with vision capabilities.
|
| 28 |
|
| 29 |
This model is the instruct post-trained version in **FP8**, fine-tuned for instruction tasks, making it ideal for chat and instruction-based use cases.
|
| 30 |
|
|
|
|
| 527 |
quantization_config=FineGrainedFP8Config(dequantize=True)
|
| 528 |
)
|
| 529 |
```
|
| 530 |
+
## Red Hat AI Evaluations
|
| 531 |
|
| 532 |
+
As part of its model validation effort, Red Hat conducted independent accuracy evaluations; the results are presented below.
|
| 533 |
+
The model was evaluated with [vLLM](https://vllm.ai/) version 0.11.2 and either [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) or
|
| 534 |
+
[lighteval](https://github.com/huggingface/lighteval) depending on the benchmark.
|
| 535 |
+
|
| 536 |
+
<details>
|
| 537 |
+
<summary>Evaluation commands</summary>
|
| 538 |
+
|
| 539 |
+
All evaluations were conducted using the vLLM server interface.
|
| 540 |
+
The server is first launched on a single H200 GPU with the following command:
|
| 541 |
+
```bash
|
| 542 |
+
vllm serve RedHatAI/Ministral-3-14B-Instruct-2512 \
|
| 543 |
+
--max-model-len 262144 \
|
| 544 |
+
--tokenizer_mode mistral \
|
| 545 |
+
--config_format mistral \
|
| 546 |
+
--load_format mistral \
|
| 547 |
+
--limit-mm-per-prompt '{"image": 10}'
|
| 548 |
+
```
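
Because the evaluation tools send requests to this server, it can help to confirm that the server is reachable before launching them. The following is a minimal sketch, not part of the original evaluation setup. It assumes the server listens at `http://0.0.0.0:8000/v1` (the base URL used in the commands in this section) and that the OpenAI-compatible `/v1/models` route is available. The helper names `models_url`, `server_ready`, and `wait_for_server` are hypothetical.

```python
import time
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL from an OpenAI-style base URL."""
    return base_url.rstrip("/") + "/models"


def server_ready(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the server answers the model-listing route with HTTP 200."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so this also
        # covers refused connections, timeouts, and non-2xx responses.
        return False


def wait_for_server(base_url: str, attempts: int = 60, delay: float = 10.0) -> bool:
    """Poll the server up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if server_ready(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_server("http://0.0.0.0:8000/v1")` before starting an evaluation run would block until the model has finished loading or the attempts are exhausted. The attempt count and delay are illustrative defaults, not values from the original setup.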
|
| 549 |
+
|
| 550 |
+
MMLU-Pro, IFEval and MMMU were evaluated using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) as follows.
|
| 551 |
+
```bash
|
| 552 |
+
lm_eval \
|
| 553 |
+
--model local-chat-completions \
|
| 554 |
+
--tasks mmlu_pro,ifeval,mmmu_val \
|
| 555 |
+
--model_args "model=RedHatAI/Ministral-3-14B-Instruct-2512,max_length=64000,base_url=http://0.0.0.0:8000/v1/chat/completions,num_concurrent=64,max_retries=3,tokenized_requests=False,tokenizer_backend=None,timeout=1200,max_images=10" \
|
| 556 |
+
--apply_chat_template \
|
| 557 |
+
--fewshot_as_multiturn \
|
| 558 |
+
--output_path results_lmeval_ministral \
|
| 559 |
+
--gen_kwargs "do_sample=True,temperature=0.15"
|
| 560 |
+
```
|
| 561 |
+
|
| 562 |
+
AIME25, GPQA Diamond and MATH 500 were evaluated using [lighteval](https://github.com/huggingface/lighteval) as follows.
|
| 563 |
+
|
| 564 |
+
Contents of `litellm_config.yaml`:
|
| 565 |
+
```yaml
|
| 566 |
+
model_parameters:
|
| 567 |
+
provider: "hosted_vllm"
|
| 568 |
+
model_name: "hosted_vllm/RedHatAI/Ministral-3-14B-Instruct-2512"
|
| 569 |
+
base_url: "http://0.0.0.0:8000/v1"
|
| 570 |
+
api_key: ""
|
| 571 |
+
timeout: 1200
|
| 572 |
+
concurrent_requests: 64
|
| 573 |
+
max_model_length: 262144
|
| 574 |
+
generation_parameters:
|
| 575 |
+
temperature: 0.15
|
| 576 |
+
max_new_tokens: 64000
|
| 577 |
+
```
|
| 578 |
+
|
| 579 |
+
```bash
|
| 580 |
+
lighteval endpoint litellm litellm_config.yaml \
|
| 581 |
+
"aime25|0,math_500|0,gpqa:diamond|0" \
|
| 582 |
+
--output-dir results_lighteval_ministral \
|
| 583 |
+
--save-details
|
| 584 |
+
```
|
| 585 |
+
|
| 586 |
+
</details>
|
| 587 |
+
|
| 588 |
+
<table>
|
| 589 |
+
<thead>
|
| 590 |
+
<tr>
|
| 591 |
+
<th>Benchmark</th>
|
| 592 |
+
<th>RedHatAI/Ministral-3-14B-Instruct-2512-BF16</th>
|
| 593 |
+
<th>RedHatAI/Ministral-3-14B-Instruct-2512</th>
|
| 594 |
+
<th>Recovery</th>
|
| 595 |
+
</tr>
|
| 596 |
+
</thead>
|
| 597 |
+
<tbody>
|
| 598 |
+
<tr>
|
| 599 |
+
<td>MMLU-Pro</td>
|
| 600 |
+
<td>41.69</td>
|
| 601 |
+
<td>45.87</td>
|
| 602 |
+
<td>110.0%</td>
|
| 603 |
+
</tr>
|
| 604 |
+
<tr>
|
| 605 |
+
<td>IFEval</td>
|
| 606 |
+
<td>77.34</td>
|
| 607 |
+
<td>76.86</td>
|
| 608 |
+
<td>99.38%</td>
|
| 609 |
+
</tr>
|
| 610 |
+
<tr>
|
| 611 |
+
<td>MMMU</td>
|
| 612 |
+
<td>55.33</td>
|
| 613 |
+
<td>55.33</td>
|
| 614 |
+
<td>100.0%</td>
|
| 615 |
+
</tr>
|
| 616 |
+
<tr>
|
| 617 |
+
<td>AIME25</td>
|
| 618 |
+
<td>36.67</td>
|
| 619 |
+
<td>36.67</td>
|
| 620 |
+
<td>100.0%</td>
|
| 621 |
+
</tr>
|
| 622 |
+
<tr>
|
| 623 |
+
<td>GPQA Diamond</td>
|
| 624 |
+
<td>58.59</td>
|
| 625 |
+
<td>58.59</td>
|
| 626 |
+
<td>100.0%</td>
|
| 627 |
+
</tr>
|
| 628 |
+
<tr>
|
| 629 |
+
<td>MATH 500</td>
|
| 630 |
+
<td>88.6</td>
|
| 631 |
+
<td>86.2</td>
|
| 632 |
+
<td>97.29%</td>
|
| 633 |
+
</tr>
|
| 634 |
+
</tbody>
|
| 635 |
+
</table>
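
The Recovery column is not defined explicitly. The reported values are consistent with reading it as the score of the quantized model expressed as a percentage of the BF16 baseline score. The sketch below reproduces the column under that assumption, using the scores from the table. The function name `recovery` is hypothetical, and treating the second score column as the FP8 model follows from the model description rather than from the table itself.

```python
# Scores copied from the table above as (BF16 baseline, quantized model).
SCORES = {
    "MMLU-Pro": (41.69, 45.87),
    "IFEval": (77.34, 76.86),
    "MMMU": (55.33, 55.33),
    "AIME25": (36.67, 36.67),
    "GPQA Diamond": (58.59, 58.59),
    "MATH 500": (88.6, 86.2),
}


def recovery(baseline: float, quantized: float) -> float:
    """Return the quantized score as a percentage of the baseline score.

    Assumption: recovery = 100 * quantized / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * quantized / baseline


if __name__ == "__main__":
    for name, (bf16, quantized) in SCORES.items():
        print(f"{name}: {recovery(bf16, quantized):.2f}%")
```

A value above 100%, as for MMLU-Pro, means the quantized model scored higher than the baseline on that benchmark in this run.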
|
| 636 |
## License
|
| 637 |
|
| 638 |
This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
|