RedHatAI/pixtral-12b-quantized.w4a16

compressed-tensors


shubhrapandit committed on Feb 24, 2025

Commit 50a2b47 · verified · 1 Parent(s): 696a910

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -160,7 +160,6 @@ The model was evaluated on OpenLLM Leaderboard [V1](https://huggingface.co/space
 <summary>Evaluation Commands</summary>
 ```
-guidellm --model neuralmagic/granite-3.1-8b-instruct-quantized.w4a16 --target "http://localhost:8000/v1" --data-type emulated --data "prompt_tokens=<prompt_tokens>,generated_tokens=<generated_tokens>" --max seconds 360 --backend aiohttp_server
 ```
 </details>
@@ -175,6 +174,7 @@ The following performance benchmarks were conducted with [vLLM](https://docs.vll
 <details>
 <summary>Benchmarking Command</summary>
 </details>

 <summary>Evaluation Commands</summary>
 ```
 ```
 </details>
 <details>
 <summary>Benchmarking Command</summary>
+guidellm --model neuralmagic/pixtral-12b-quantized.w4a16 --target "http://localhost:8000/v1" --data-type emulated --data prompt_tokens=128,generated_tokens=128,images=1,width=640,height=480 --max seconds 120 --backend aiohttp_server
 </details>