---
license: other
base_model: MiniMaxAI/MiniMax-M2.7
tags:
- turboquant
- quantization
- 3-bit
- vllm
- mini-max
---

# MiniMax-M2.7-TQ3

A **TurboQuant 3-bit** quantized version of [MiniMax-M2.7](https://huggingface.co/MiniMaxAI/MiniMax-M2.7), optimized for inference with [turboquant-vllm](https://github.com/varjoranta/turboquant-vllm).
## Model Details

- **Base Model:** MiniMaxAI/MiniMax-M2.7
- **Quantization:** TurboQuant 3-bit
- **Quantization Tool:** [turboquant-vllm](https://github.com/varjoranta/turboquant-vllm)
- **Architecture:** Transformer-based LLM with extended context support
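
As a rough illustration of why a 3-bit checkpoint matters, weight-storage footprint scales with bits per parameter. The parameter count below is a hypothetical placeholder, not the actual size of MiniMax-M2.7, and the estimate ignores the small overhead of quantization scales and zero-points:

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores scale/zero-point overhead)."""
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical 100B-parameter model (NOT the real MiniMax-M2.7 size):
n = 100e9
print(weight_footprint_gb(n, 16))  # bf16 baseline: 200.0 GB
print(weight_footprint_gb(n, 3))   # 3-bit:          37.5 GB
```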

## Usage

This quantized model is designed to work with the turboquant-vllm inference engine. Please refer to the [turboquant-vllm repository](https://github.com/varjoranta/turboquant-vllm) for installation and usage instructions.

### Example

```python
# Please refer to turboquant-vllm for proper model loading
```

## Chat Template

The model uses a Jinja chat template with support for:
- System messages
- Tool/function calling (`<minimax:tool_call>` / `</minimax:tool_call>` delimiters)
- Reasoning content (`<think>` / `</think>` delimiters)
- Multi-turn conversations

The default model identity is: *"You are a helpful assistant. Your name is MiniMax-M2.7 and is built by MiniMax."*
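
Downstream code often needs to separate reasoning content from the final answer. A minimal sketch, assuming the raw response wraps reasoning in a single `<think>` / `</think>` block as described above (the exact serving-side behavior depends on how turboquant-vllm parses the template):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer), assuming one <think>...</think> block."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

r, a = split_reasoning("<think>Check the units first.</think>The answer is 42.")
# r == "Check the units first.", a == "The answer is 42."
```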

## Tokenizer

- **Backend:** tokenizers
- **Vocabulary Size:** (see tokenizer files)
- **Special Tokens:** Includes tokens for tool calls, reasoning markers, and standard control tokens

## Quantization Details

This is a 3-bit quantized checkpoint intended for efficient inference. The quantization was applied using the TurboQuant method via the turboquant-vllm project.
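
TurboQuant's actual algorithm lives in the turboquant-vllm repository; as a generic illustration of what 3-bit weight quantization involves, here is a plain symmetric round-to-nearest sketch (not the TurboQuant scheme itself):

```python
def quantize_3bit(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric round-to-nearest quantization to the 3-bit signed range [-4, 3]."""
    scale = max(abs(w) for w in weights) / 4  # map max magnitude onto 4 levels
    q = [max(-4, min(3, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate weights from 3-bit codes and a per-group scale."""
    return [v * scale for v in q]

q, s = quantize_3bit([0.5, -1.0, 0.25, 2.0])
approx = dequantize(q, s)  # each value within one quantization step of the original
```

Real schemes layer on per-group scales, outlier handling, and packed storage, but the round-trip above is the core idea: 8 representable levels per weight instead of 65,536 for bf16.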

## Disclaimer

This is a third-party quantized version of the original MiniMax-M2.7 model. Please refer to the original model card for base model details and licensing.