---
pipeline_tag: image-text-to-text
license: other
license_name: minimax-community
license_link: LICENSE
library_name: transformers
tags:
- multimodal
- moe
- agent
- coding
- video
base_model:
- MiniMaxAI/MiniMax-M3
---
<div align="center">
<img width="60%" src="figures/logo.svg" alt="MiniMax">
</div>
<hr>
<div align="center" style="line-height: 1.4; font-size:16px; margin-top: 30px;">
Join our
<a href="https://platform.minimaxi.com/docs/faq/contact-us" target="_blank" style="font-size:17px; margin: 2px;">
💬 WeChat
</a> |
<a href="https://discord.com/invite/DPC4AHFCBw" target="_blank" style="font-size:17px; margin: 2px;">
🧩 Discord
</a>
community.
</div>
<div align="center" style="line-height: 1.2; font-size:16px;">
<a href="https://agent.minimax.io/" target="_blank" style="display: inline-block; margin: 4px;">
MiniMax Agent
</a> |
<a href="https://platform.minimax.io/docs/guides/text-generation" target="_blank" style="display: inline-block; margin: 4px;">
⚡️ API
</a> |
<a href="https://github.com/MiniMax-AI/cli" style="display: inline-block; margin: 4px;">
CLI
</a> |
<a href="https://www.minimax.io" target="_blank" style="display: inline-block; margin: 4px;">
MiniMax Website
</a>
</div>
<div align="center" style="line-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/MiniMaxAI" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/MiniMax-AI/MiniMax-M3" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://www.modelscope.cn/organization/MiniMax" target="_blank" style="margin: 2px;">
🤖️ ModelScope
</a> |
<a href="https://huggingface.co/MiniMaxAI/MiniMax-M3/blob/main/LICENSE" style="margin: 2px;">
📄 LICENSE
</a>
</div>
MiniMax-M3 is a natively multimodal model with a 1M-token context window. It has ~428B total parameters, of which ~23B are activated per token.
**Highlights:**
- **Native Multimodality:** M3 undergoes mixed-modality training from the very first step, enabling deeper semantic fusion across text, image, and video.
- **Context Scaling via Sparse Attention:** M3 introduces MiniMax Sparse Attention (MSA) to improve long-context efficiency. At 1M context, M3 delivers 9× prefill and 15× decode speedups compared to M2, reducing per-token compute to 1/20.
- **Coding & Cowork Capability:** M3 achieves frontier-level performance across long-horizon agentic benchmarks, excelling in both coding and cowork.
## Model Details
| Property | Value |
| --- | --- |
| Architecture | MoE + MSA (MiniMax Sparse Attention) |
| Total Parameters | ~428B |
| Activated Parameters | ~23B |
| Experts | 128 (4 active per token) |
| Layers | 60 |
| Context Length | 1M tokens |
| Modalities | Text, Image, Video |
| Precision | bfloat16 |
| Transformers | ≥ 4.52.4 (`trust_remote_code=True`) |
| License | [MiniMax Community License](LICENSE) |
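
As a rough sanity check on the figures in the table, the short Python sketch below derives the share of parameters activated per token and the memory needed for the bfloat16 weights alone. It uses only the approximate values listed above; the memory figure is a lower bound that ignores the KV cache, activations, and any framework overhead, and it assumes no quantization.

```python
# Back-of-the-envelope sizing from the approximate figures in the table.
TOTAL_PARAMS = 428e9     # ~428B total parameters
ACTIVE_PARAMS = 23e9     # ~23B parameters activated per token
BYTES_PER_PARAM = 2      # bfloat16 stores each weight in 2 bytes
NUM_EXPERTS = 128
ACTIVE_EXPERTS = 4

# Share of all parameters that take part in one forward pass for one token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Share of experts that the router selects for one token.
expert_fraction = ACTIVE_EXPERTS / NUM_EXPERTS

# Memory for the weights only, in gigabytes (1 GB = 1e9 bytes).
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

print(f"activated share of parameters: {active_fraction:.1%}")
print(f"experts selected per token:    {expert_fraction:.1%}")
print(f"bfloat16 weights alone:        ~{weights_gb:.0f} GB")
```

The activated share of parameters is larger than the share of experts selected, which is what one would expect if the non-expert parts of the network (for example attention and embeddings) run for every token.
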
<p align="center">
<img width="100%" src="figures/benchmark.jpeg">
</p>
## How to Use
- [MiniMax Agent](https://agent.minimax.io/)
- [MiniMax API](https://platform.minimax.io/)
M3 supports two reasoning modes:
- **thinking** — for complex reasoning, agentic tasks, and long-horizon collaboration.
- **non-thinking** — for latency-sensitive scenarios such as chat and code completion.
## Local Deployment
Download the model:
```bash
hf download MiniMaxAI/MiniMax-M3 --local-dir MiniMax-M3
```
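
The same download can be scripted from Python. The sketch below is one possible equivalent, assuming the `huggingface_hub` package is installed; the helper name `download_model` is ours and is not part of any library.

```python
def download_model(repo_id: str = "MiniMaxAI/MiniMax-M3",
                   local_dir: str = "MiniMax-M3") -> str:
    """Download every file in the repository and return the local folder path."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage (downloads the full checkpoint, so make sure enough disk space is free):
# path = download_model()
# print(path)
```
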
We recommend the following inference frameworks to serve the model:
### SGLang
We recommend using [SGLang](https://docs.sglang.io/) to serve MiniMax-M3. Please refer to our [SGLang Deployment Guide](./docs/sglang_deploy_guide.md).
### vLLM
We recommend using [vLLM](https://github.com/vllm-project/vllm) to serve MiniMax-M3. Please refer to our [vLLM Deployment Guide](./docs/vllm_deploy_guide.md).
### Transformers
We recommend using [Transformers](https://github.com/huggingface/transformers) to serve MiniMax-M3. Please refer to our [Transformers Deployment Guide](./docs/transformers_deploy_guide.md).
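
For orientation, a minimal Transformers sketch using the high-level `image-text-to-text` pipeline might look like the following. It is an illustration under assumptions rather than the procedure from the deployment guide: the helper names are ours, `device_map="auto"` assumes the `accelerate` package is installed, and a checkpoint of this size needs a multi-GPU machine.

```python
def build_messages(image_url: str, question: str) -> list:
    """Build a single-turn chat with one image and one text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def ask_about_image(image_url: str, question: str,
                    model_id: str = "MiniMaxAI/MiniMax-M3",
                    max_new_tokens: int = 40):
    """Load the model through a pipeline and return the raw pipeline output."""
    # Imported lazily: loading transformers (and the model) is expensive.
    from transformers import pipeline

    pipe = pipeline(
        "image-text-to-text",
        model=model_id,
        trust_remote_code=True,  # the model ships custom modeling code
        device_map="auto",       # assumption: shard across available GPUs
    )
    return pipe(text=build_messages(image_url, question),
                max_new_tokens=max_new_tokens)


# Usage:
# result = ask_about_image(
#     "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG",
#     "What animal is on the candy?",
# )
# print(result)
```
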
### ModelScope
You can also get model weights from [ModelScope](https://modelscope.cn/models/MiniMax/MiniMax-M3).
### Inference Parameters
We recommend the following parameters for best performance: `temperature=1.0`, `top_p=0.95`, `top_k=40`. Default system prompt:
```
You are a helpful assistant. Your name is MiniMax-M3 and was built by MiniMax.
```
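
The sketch below shows one way to apply these settings when calling a locally served model through an OpenAI-compatible `/v1/chat/completions` endpoint, as exposed by vLLM and SGLang. The base URL, the served model name, and the helper names are assumptions for illustration; `top_k` is not part of the standard OpenAI request schema, so whether it is honored depends on the serving framework.

```python
import json
import urllib.request

DEFAULT_SYSTEM_PROMPT = (
    "You are a helpful assistant. Your name is MiniMax-M3 and was built by MiniMax."
)


def build_payload(user_text: str, model: str = "MiniMaxAI/MiniMax-M3") -> dict:
    """Build a chat-completions request body with the recommended sampling settings."""
    return {
        "model": model,  # assumption: must match the name the server was started with
        "messages": [
            {"role": "system", "content": DEFAULT_SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 40,  # framework-specific extension to the OpenAI schema
    }


def chat(user_text: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the request to a running server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(user_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a server already listening on base_url):
# print(chat("Summarize what a mixture-of-experts model is in two sentences."))
```
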
## Tool Calling Guide
Please refer to our [Tool Calling Guide](./docs/tool_calling_guide.md).
## Contact Us
Contact us at [model@minimax.io](mailto:model@minimax.io).