metadata
base_model:
- CohereLabs/c4ai-command-a-03-2025
gghfez/command-a-03-2025-AWQ
Tested with vllm==0.10.1
Usage:
vllm serve gghfez/command-a-03-2025-AWQ --port 8080 --host 0.0.0.0 --dtype bfloat16 --max-model-len 32768 -tp 4 --gpu-memory-utilization 0.9