Checkpoint of Mistral-Small-3.1-24B-Instruct-2503 with FP8 per-tensor quantization in the Mistral format.
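For intuition, per-tensor FP8 quantization uses a single scale for the whole weight tensor, chosen so the largest-magnitude value maps onto the top of the FP8 E4M3 range (max normal = 448.0). The sketch below shows only the scale computation and clamping; real kernels also round each scaled value to the nearest representable E4M3 number, which is omitted here for brevity.

```python
# Minimal sketch of per-tensor FP8 (E4M3) scaling.
# Illustrative only: rounding to the E4M3 grid is omitted.

E4M3_MAX = 448.0  # largest normal value representable in FP8 E4M3


def per_tensor_fp8_scale(weights):
    """One scale for the whole tensor: amax / E4M3_MAX."""
    amax = max(abs(w) for w in weights)
    return amax / E4M3_MAX if amax > 0 else 1.0


def quantize_dequantize(weights):
    scale = per_tensor_fp8_scale(weights)
    # Scale into the FP8 range, clamp, then scale back.
    q = [max(-E4M3_MAX, min(E4M3_MAX, w / scale)) for w in weights]
    return [v * scale for v in q], scale


deq, scale = quantize_dequantize([0.02, -1.5, 3.0, -0.25])
```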

Please run with vLLM like so:

```
vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10'
```
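`vllm serve` exposes an OpenAI-compatible API (by default on port 8000), so the served model can be queried with a standard chat-completions request mixing text and image content. The sketch below just builds such a payload; the image URL is a placeholder, and the actual POST is commented out since it needs the server above to be running.

```python
# Build an OpenAI-style chat-completions payload with text + image content.
# Placeholder image URL; POST is commented out (requires a running server).
import json

payload = {
    "model": "nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this chart?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    "max_tokens": 256,
}

body = json.dumps(payload)
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

Note the `--limit_mm_per_prompt 'image=10'` flag above caps each request at ten images.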

Evaluations against the unquantized baseline on ChartQA:

```
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer_mode mistral --config_format mistral --load_format mistral
python -m eval.run eval_vllm --model_name mistralai/Mistral-Small-3.1-24B-Instruct-2503 --url http://0.0.0.0:8000 --output_dir output/ --eval_name "chartqa"
Querying model: 100%|████████████████████████| 2500/2500 [07:37<00:00, 5.47it/s]
================================================================================
Metrics:
{
...
}

vllm serve nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --tokenizer_mode mistral --config_format mistral --load_format mistral
python -m eval.run eval_vllm --model_name nm-testing/Mistral-Small-3.1-24B-Instruct-2503-FP8 --url http://0.0.0.0:8000 --output_dir output/ --eval_name "chartqa"
Querying model: 100%|█████████████████████████| 2500/2500 [06:37<00:00, 6.28it/s]
================================================================================
Metrics:
{
...
}
```
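Once both runs finish, the two `Metrics` blocks can be compared directly, e.g. as the fraction of baseline accuracy the quantized checkpoint retains. The numbers below are placeholders, not the actual ChartQA results.

```python
# Compare a quantized model's metric against its baseline.
# The two accuracy values here are placeholders, not real results.

def recovery(baseline: float, quantized: float) -> float:
    """Fraction of the baseline metric retained by the quantized model."""
    return quantized / baseline


baseline_acc = 0.85  # placeholder value
fp8_acc = 0.84       # placeholder value
print(f"accuracy recovery: {recovery(baseline_acc, fp8_acc):.2%}")
```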