Update README.md
README.md CHANGED

@@ -95,61 +95,28 @@ foundation for next-generation language model agents to reason and tackle real-w
## 2. Evaluation

-<!-- **Performance of MiniMax-M1 on core benchmarks.**
-
-| **Tasks** | **OpenAI-o3** | **Gemini 2.5<br>Pro (06-05)** | **Claude<br>4 Opus** | **Seed-<br>Thinking-<br>v1.5** | **DeepSeek-<br>R1** | **DeepSeek-<br>R1-0528** | **Qwen3-<br>235B-A22B** | **MiniMax-<br>M1-40K** | **MiniMax-<br>M1-80K** |
-|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| *Extended<br>Thinking* | *100k* | *64k* | *64k* | *32k* | *32k* | *64k* | *32k* | *40K* | *80K* |
-| ***Mathematics*** |
-| AIME 2024 | 91.6 | 92.0 | 76.0 | 86.7 | 79.8 | 91.4 | 85.7 | 83.3 | 86.0 |
-| AIME 2025 | 88.9 | 88.0 | 75.5 | 74.0 | 70.0 | 87.5 | 81.5 | 74.6 | 76.9 |
-| MATH-500 | 98.1 | 98.8 | 98.2 | 96.7 | 97.3 | 98.0 | 96.2 | 96.0 | 96.8 |
-| ***General Coding*** |
-| LiveCodeBench<br>*(24/8~25/5)* | 75.8 | 77.1 | 56.6 | 67.5 | 55.9 | 73.1 | 65.9 | 62.3 | 65.0 |
-| FullStackBench | 69.3 | -- | 70.3 | 69.9 | 70.1 | 69.4 | 62.9 | 67.6 | 68.3 |
-| ***Reasoning & Knowledge*** |
-| GPQA Diamond | 83.3 | 86.4 | 79.6 | 77.3 | 71.5 | 81.0 | 71.1 | 69.2 | 70.0 |
-| HLE *(no tools)* | 20.3 | 21.6 | 10.7 | 8.2 | 8.6\* | 17.7\* | 7.6\* | 7.2\* | 8.4\* |
-| ZebraLogic | 95.8 | 91.6 | 95.1 | 84.4 | 78.7 | 95.1 | 80.3 | 80.1 | 86.8 |
-| MMLU-Pro | 85.0 | 86.0 | 85.0 | 87.0 | 84.0 | 85.0 | 83.0 | 80.6 | 81.1 |
-| ***Software Engineering*** |
-| SWE-bench Verified | 69.1 | 67.2 | 72.5 | 47.0 | 49.2 | 57.6 | 34.4 | 55.6 | 56.0 |
-| ***Long Context*** |
-| OpenAI-MRCR *(128k)* | 56.5 | 76.8 | 48.9 | 54.3 | 35.8 | 51.5 | 27.7 | 76.1 | 73.4 |
-| OpenAI-MRCR *(1M)* | -- | 58.8 | -- | -- | -- | -- | -- | 58.6 | 56.2 |
-| LongBench-v2 | 58.8 | 65.0 | 55.6 | 52.5 | 58.3 | 52.1 | 50.1 | 61.0 | 61.5 |
-| ***Agentic Tool Use*** |
-| TAU-bench *(airline)* | 52.0 | 50.0 | 59.6 | 44.0 | -- | 53.5 | 34.7 | 60.0 | 62.0 |
-| TAU-bench *(retail)* | 73.9 | 67.0 | 81.4 | 55.7 | -- | 63.9 | 58.6 | 67.8 | 63.5 |
-| ***Factuality*** |
-| SimpleQA | 49.4 | 54.0 | -- | 12.9 | 30.1 | 27.8 | 11.0 | 17.9 | 18.5 |
-| ***General Assistant*** |
-| MultiChallenge | 56.5 | 51.8 | 45.8 | 43.0 | 40.7 | 45.0 | 40.0 | 44.7 | 44.7 |
-
-\* conducted on the text-only HLE subset. -->
-
**Performance of MiniMax-M1 on core benchmarks.**

-| **Category** | **Task** | **OpenAI-o3** | **Gemini 2.5 Pro (06-05)** | **Claude 4 Opus** | **Seed-Thinking-v1.5** | **DeepSeek-R1** | **DeepSeek-R1-0528** | **Qwen3-235B-A22B** |
-|:---|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| | *Extended Thinking* | *
-| ***Mathematics*** | AIME 2024 | 91.6 | 92.0 | 76.0 | 86.7 | 79.8 | 91.4 | 85.7 |
-| | AIME 2025 | 88.9 | 88.0 | 75.5 | 74.0 | 70.0 | 87.5 | 81.5 |
-| | MATH-500 | 98.1 | 98.8 | 98.2 | 96.7 | 97.3 | 98.0 | 96.2 |
-| ***General Coding*** | LiveCodeBench *(24/8~25/5)* | 75.8 | 77.1 | 56.6 | 67.5 | 55.9 | 73.1 | 65.9 |
-| | FullStackBench | 69.3 | -- | 70.3 | 69.9 | 70.1 | 69.4 | 62.9 |
-| ***Reasoning & Knowledge*** | GPQA Diamond | 83.3 | 86.4 | 79.6 | 77.3 | 71.5 | 81.0 | 71.1 |
-| | HLE *(no tools)* | 20.3 | 21.6 | 10.7 | 8.2 | 8.6\* | 17.7\* | 7.6\* |
-| | ZebraLogic | 95.8 | 91.6 | 95.1 | 84.4 | 78.7 | 95.1 | 80.3 |
-| | MMLU-Pro | 85.0 | 86.0 | 85.0 | 87.0 | 84.0 | 85.0 | 83.0 |
-| ***Software Engineering*** | SWE-bench Verified | 69.1 | 67.2 | 72.5 | 47.0 | 49.2 | 57.6 | 34.4 |
-| ***Long Context*** | OpenAI-MRCR *(128k)* | 56.5 | 76.8 | 48.9 | 54.3 | 35.8 | 51.5 | 27.7 |
-| | OpenAI-MRCR *(1M)* |
-| | LongBench-v2 | 58.8 | 65.0 | 55.6 | 52.5 | 58.3 | 52.1 | 50.1 |
-| ***Agentic Tool Use*** | TAU-bench *(airline)* | 52.0 | 50.0 | 59.6 | 44.0 | -- | 53.5 | 34.7 |
-| | TAU-bench *(retail)* | 73.9 | 67.0 | 81.4 | 55.7 | -- | 63.9 | 58.6 |
-| ***Factuality*** | SimpleQA | 49.4 | 54.0 | -- | 12.9 | 30.1 | 27.8 | 11.0 |
-| ***General Assistant*** | MultiChallenge | 56.5 | 51.8 | 45.8 | 43.0 | 40.7 | 45.0 | 40.0 |
+| **Category** | **Task** | **MiniMax-M1-40K** | **MiniMax-M1-80K** | **OpenAI-o3** | **Gemini 2.5 Pro (06-05)** | **Claude 4 Opus** | **Seed-Thinking-v1.5** | **DeepSeek-R1** | **DeepSeek-R1-0528** | **Qwen3-235B-A22B** |
+|:---|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| | *Extended Thinking* | *40K* | *80K* | *100k* | *64k* | *64k* | *32k* | *32k* | *64k* | *32k* |
+| ***Mathematics*** | AIME 2024 | 83.3 | 86.0 | 91.6 | 92.0 | 76.0 | 86.7 | 79.8 | 91.4 | 85.7 |
+| | AIME 2025 | 74.6 | 76.9 | 88.9 | 88.0 | 75.5 | 74.0 | 70.0 | 87.5 | 81.5 |
+| | MATH-500 | 96.0 | 96.8 | 98.1 | 98.8 | 98.2 | 96.7 | 97.3 | 98.0 | 96.2 |
+| ***General Coding*** | LiveCodeBench *(24/8~25/5)* | 62.3 | 65.0 | 75.8 | 77.1 | 56.6 | 67.5 | 55.9 | 73.1 | 65.9 |
+| | FullStackBench | 67.6 | 68.3 | 69.3 | -- | 70.3 | 69.9 | 70.1 | 69.4 | 62.9 |
+| ***Reasoning & Knowledge*** | GPQA Diamond | 69.2 | 70.0 | 83.3 | 86.4 | 79.6 | 77.3 | 71.5 | 81.0 | 71.1 |
+| | HLE *(no tools)* | 7.2\* | 8.4\* | 20.3 | 21.6 | 10.7 | 8.2 | 8.6\* | 17.7\* | 7.6\* |
+| | ZebraLogic | 80.1 | 86.8 | 95.8 | 91.6 | 95.1 | 84.4 | 78.7 | 95.1 | 80.3 |
+| | MMLU-Pro | 80.6 | 81.1 | 85.0 | 86.0 | 85.0 | 87.0 | 84.0 | 85.0 | 83.0 |
+| ***Software Engineering*** | SWE-bench Verified | 55.6 | 56.0 | 69.1 | 67.2 | 72.5 | 47.0 | 49.2 | 57.6 | 34.4 |
+| ***Long Context*** | OpenAI-MRCR *(128k)* | 76.1 | 73.4 | 56.5 | 76.8 | 48.9 | 54.3 | 35.8 | 51.5 | 27.7 |
+| | OpenAI-MRCR *(1M)* | 58.6 | 56.2 | -- | 58.8 | -- | -- | -- | -- | -- |
+| | LongBench-v2 | 61.0 | 61.5 | 58.8 | 65.0 | 55.6 | 52.5 | 58.3 | 52.1 | 50.1 |
+| ***Agentic Tool Use*** | TAU-bench *(airline)* | 60.0 | 62.0 | 52.0 | 50.0 | 59.6 | 44.0 | -- | 53.5 | 34.7 |
+| | TAU-bench *(retail)* | 67.8 | 63.5 | 73.9 | 67.0 | 81.4 | 55.7 | -- | 63.9 | 58.6 |
+| ***Factuality*** | SimpleQA | 17.9 | 18.5 | 49.4 | 54.0 | -- | 12.9 | 30.1 | 27.8 | 11.0 |
+| ***General Assistant*** | MultiChallenge | 44.7 | 44.7 | 56.5 | 51.8 | 45.8 | 43.0 | 40.7 | 45.0 | 40.0 |

\* conducted on the text-only HLE subset.