Instructions to use MiniMaxAI/MiniMax-M1-40k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MiniMaxAI/MiniMax-M1-40k with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="MiniMaxAI/MiniMax-M1-40k", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("MiniMaxAI/MiniMax-M1-40k", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use MiniMaxAI/MiniMax-M1-40k with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "MiniMaxAI/MiniMax-M1-40k"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "MiniMaxAI/MiniMax-M1-40k",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/MiniMaxAI/MiniMax-M1-40k
- SGLang
How to use MiniMaxAI/MiniMax-M1-40k with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "MiniMaxAI/MiniMax-M1-40k" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "MiniMaxAI/MiniMax-M1-40k",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "MiniMaxAI/MiniMax-M1-40k" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "MiniMaxAI/MiniMax-M1-40k",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use MiniMaxAI/MiniMax-M1-40k with Docker Model Runner:
    docker model run hf.co/MiniMaxAI/MiniMax-M1-40k
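The curl calls shown for the vLLM and SGLang servers can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `chat` are hypothetical, and the response parsing assumes the server returns the usual OpenAI chat completions schema, which the source describes only as "OpenAI-compatible".

```python
import json
import urllib.request

MODEL_ID = "MiniMaxAI/MiniMax-M1-40k"


def build_chat_request(base_url, prompt, model=MODEL_ID):
    """Build a POST request for the /v1/chat/completions route (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, prompt, timeout=600):
    """Send the prompt and return the text of the first choice.

    Assumption: the response body follows the OpenAI chat completions
    schema, i.e. body["choices"][0]["message"]["content"] holds the reply.
    """
    request = build_chat_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example calls (require a running server, so they are left commented out):
# print(chat("http://localhost:8000", "What is the capital of France?"))   # vLLM
# print(chat("http://localhost:30000", "What is the capital of France?"))  # SGLang
```

Keeping request construction separate from the network call makes it possible to inspect the payload without a running server. The ports match the examples above: 8000 for vLLM and 30000 for SGLang.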
Commit History
fix docs 3b20a9c
fix code 5fe30d5
Merge branch 'main' of https://huggingface.co/MiniMaxAI/MiniMax-M1-40k into main b1645e4
upload LICENSE 9b89b84
Merge branch 'main' of https://huggingface.co/MiniMaxAI/MiniMax-M1-40k into main 7014b9f
Update function_call_guide_cn.md 17002c5 verified
Update function_call_guide.md a7365d6 verified
Delete LICENSE-CODE e02c7f7 verified
fix code f0dd085
Delete LICENSE-MODEL 0b21985 verified
fix docs 9385f59
fix docs fdf6c12
fix docs eb3e70f
add docs 75b83bd
add model weight 6b33168
add model weight a257f69
add model weight b5a090c
update f45851d
Update tokenizer_config.json (#1) c1f6b16 verified
fix 1553534
qingjun committed
fix code c0ea1e7
qingjun committed
update b50f3ee
qingjun committed
change text-01 -> M1 777447b
qingjun committed
change files 9117507
change files 3a77662
add files b620f7e
Initial Commit 4e7757c
qingjun committed