Chat Completion "reasoning" support
#2 on CohereLabs/command-a-plus-05-2026-w4a4, opened by yzong-rh
Files changed: chat_template.jinja (+5 -3)
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -125,8 +125,10 @@ These instructions serve as your defaults, but they can be overridden in subsequ
 {%- endif %}
 {%- endmacro %}
 {%- macro print_thinking(msg) %}
-{%- if msg.thinking -%}
+{%- if msg.thinking is defined and msg.thinking is not none -%}
 {{ msg.thinking }}
+{%- elif msg.reasoning is defined and msg.reasoning is not none -%}
+{{ msg.reasoning }}
 {%- elif msg.content and msg.content[0].thinking -%}
 {{ msg.content[0].thinking }}
 {%- endif %}
@@ -224,7 +226,7 @@ Your output should adhere to the following json schema:
 {% if not skip_thinking %}
 {% if message.tool_plan -%}
 <|START_THINKING|>{{ message.tool_plan }}<|END_THINKING|>
-{%- elif message.thinking or (message.content and message.content[0].type == "thinking") -%}
+{%- elif (message.thinking is defined and message.thinking is not none) or (message.reasoning is defined and message.reasoning is not none) or (message.content and message.content[0].type == "thinking") -%}
 <|START_THINKING|>{{ print_thinking(message) }}<|END_THINKING|>
 {%- endif %}
 {%- endif %}<|START_ACTION|>[
@@ -236,7 +238,7 @@ Your output should adhere to the following json schema:
 
 ]<|END_ACTION|><|END_OF_TURN_TOKEN|>
 {%- else -%}
-{% if (message.thinking or (message.content and message.content[0].type == "thinking")) and not skip_thinking -%}
+{% if ((message.thinking is defined and message.thinking is not none) or (message.reasoning is defined and message.reasoning is not none) or (message.content and message.content[0].type == "thinking")) and not skip_thinking -%}
 <|START_THINKING|>{{ print_thinking(message) }}<|END_THINKING|>
 {%- endif -%}
 {{ print_msg(message) }}<|END_OF_TURN_TOKEN|>
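The effect of the change can be checked in isolation. The sketch below is not the full chat template: it copies only the patched `print_thinking` macro and the new guard condition into a small Jinja string and renders it with the `jinja2` package. The `TEMPLATE` string, the `render_thinking` helper, and the sample messages are assumptions made for illustration. `skip_thinking`, `tool_plan`, and `print_msg` are left out.

```python
from jinja2 import Environment

# Reduced stand-in for the patched template: only the macro and the guard
# condition from the diff are reproduced here (illustrative, not the real file).
TEMPLATE = """\
{%- macro print_thinking(msg) -%}
{%- if msg.thinking is defined and msg.thinking is not none -%}
{{ msg.thinking }}
{%- elif msg.reasoning is defined and msg.reasoning is not none -%}
{{ msg.reasoning }}
{%- elif msg.content and msg.content[0].thinking -%}
{{ msg.content[0].thinking }}
{%- endif -%}
{%- endmacro -%}
{%- if (message.thinking is defined and message.thinking is not none) or (message.reasoning is defined and message.reasoning is not none) or (message.content and message.content[0].type == "thinking") -%}
<|START_THINKING|>{{ print_thinking(message) }}<|END_THINKING|>
{%- endif -%}
"""

_template = Environment().from_string(TEMPLATE)


def render_thinking(message: dict) -> str:
    """Hypothetical helper: render only the thinking span for one message."""
    return _template.render(message=message)


if __name__ == "__main__":
    samples = [
        {"role": "assistant", "reasoning": "step by step"},      # new field
        {"role": "assistant", "thinking": "legacy field"},       # existing field
        {"role": "assistant", "thinking": "a", "reasoning": "b"},  # thinking wins
        {"role": "assistant", "reasoning": None},                # ignored
        {"role": "assistant",
         "content": [{"type": "thinking", "thinking": "from content"}]},
    ]
    for sample in samples:
        print(repr(render_thinking(sample)))
```

In this reduced form, a message carrying only `reasoning` renders the same thinking span as one carrying `thinking`, and `thinking` takes precedence when both are present. One behavioural difference from the old truthiness check is worth noting: because the new test is `is defined and ... is not none`, an empty string in `thinking` or `reasoning` still emits an empty `<|START_THINKING|><|END_THINKING|>` pair, whereas the previous `msg.thinking` check would have skipped it.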