Checkpoint of Mistral-Small-3.1-24B-Instruct-2503 with FP8 per-tensor quantization in the Mistral format.
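For intuition, per-tensor FP8 quantization uses a single scale for the whole weight tensor, chosen so the largest-magnitude value maps onto the top of the FP8 E4M3 range (max normal = 448.0). The sketch below shows only the scale computation and clamping; real kernels also round each scaled value to the nearest representable E4M3 number, which is omitted here for brevity.

```python
# Minimal sketch of per-tensor FP8 (E4M3) scaling.
# Illustrative only: rounding to the E4M3 grid is omitted.

E4M3_MAX = 448.0  # largest normal value representable in FP8 E4M3


def per_tensor_fp8_scale(weights):
    """One scale for the whole tensor: amax / E4M3_MAX."""
    amax = max(abs(w) for w in weights)
    return amax / E4M3_MAX if amax > 0 else 1.0


def quantize_dequantize(weights):
    scale = per_tensor_fp8_scale(weights)
    # Scale into the FP8 range, clamp, then scale back.
    q = [max(-E4M3_MAX, min(E4M3_MAX, w / scale)) for w in weights]
    return [v * scale for v in q], scale


deq, scale = quantize_dequantize([0.02, -1.5, 3.0, -0.25])
```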

Please run with vLLM like so:

```
vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10'
```
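`vllm serve` exposes an OpenAI-compatible API (by default on port 8000), so the served model can be queried with a standard chat-completions request mixing text and image content. The sketch below just builds such a payload; the image URL is a placeholder, and the actual POST is commented out since it needs the server above to be running.

```python
# Build an OpenAI-style chat-completions payload with text + image content.
# Placeholder image URL; POST is commented out (requires a running server).
import json

payload = {
    "model": "nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this chart?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    "max_tokens": 256,
}

body = json.dumps(payload)
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

Note the `--limit_mm_per_prompt 'image=10'` flag above caps each request at ten images.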

Evaluations against the unquantized baseline on ChartQA:

```
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer_mode mistral --config_format mistral --load_format mistral
python -m eval.run eval_vllm --model_name mistralai/Mistral-Small-3.1-24B-Instruct-2503 --url http://0.0.0.0:8000 --output_dir output/ --eval_name "chartqa"
Querying model: 100%|████████████████████████| 2500/2500 [07:37<00:00, 5.47it/s]
================================================================================
Metrics:
{
...
}

vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --tokenizer_mode mistral --config_format mistral --load_format mistral
python -m eval.run eval_vllm --model_name nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --url http://0.0.0.0:8000 --output_dir output/ --eval_name "chartqa"
Querying model: 100%|█████████████████████████| 2500/2500 [06:37<00:00, 6.28it/s]
================================================================================
Metrics:
{
...
}
```
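Once both runs finish, the two `Metrics` blocks can be compared directly, e.g. as the fraction of baseline accuracy the quantized checkpoint retains. The numbers below are placeholders, not the actual ChartQA results.

```python
# Compare a quantized model's metric against its baseline.
# The two accuracy values here are placeholders, not real results.

def recovery(baseline: float, quantized: float) -> float:
    """Fraction of the baseline metric retained by the quantized model."""
    return quantized / baseline


baseline_acc = 0.85  # placeholder value
fp8_acc = 0.84       # placeholder value
print(f"accuracy recovery: {recovery(baseline_acc, fp8_acc):.2%}")
```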