---
license: other
base_model: MiniMaxAI/MiniMax-M2.7
tags:
- turboquant
- quantization
- 3-bit
- vllm
- mini-max
---

# MiniMax-M2.7-TQ3

A **TurboQuant 3-bit** quantized version of [MiniMax-M2.7](https://huggingface.co/MiniMaxAI/MiniMax-M2.7), optimized for inference with [turboquant-vllm](https://github.com/varjoranta/turboquant-vllm).
## Model Details

- **Base Model:** MiniMaxAI/MiniMax-M2.7
- **Quantization:** TurboQuant 3-bit
- **Quantization Tool:** [turboquant-vllm](https://github.com/varjoranta/turboquant-vllm)
- **Architecture:** Transformer-based LLM with extended context support
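
As a rough illustration of why a 3-bit checkpoint matters, weight-storage footprint scales with bits per parameter. The parameter count below is a hypothetical placeholder, not the actual size of MiniMax-M2.7, and the estimate ignores the small overhead of quantization scales and zero-points:

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores scale/zero-point overhead)."""
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical 100B-parameter model (NOT the real MiniMax-M2.7 size):
n = 100e9
print(weight_footprint_gb(n, 16))  # bf16 baseline: 200.0 GB
print(weight_footprint_gb(n, 3))   # 3-bit:          37.5 GB
```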

## Usage

This quantized model is designed to work with the turboquant-vllm inference engine. Please refer to the [turboquant-vllm repository](https://github.com/varjoranta/turboquant-vllm) for installation and usage instructions.

### Example

```python
# Please refer to turboquant-vllm for proper model loading
```

## Chat Template

The model uses a Jinja chat template with support for:
- System messages
- Tool/function calling (`<minimax:tool_call>` / `</minimax:tool_call>` delimiters)
- Reasoning content (`<think>` / `</think>` delimiters)
- Multi-turn conversations

The default model identity is: *"You are a helpful assistant. Your name is MiniMax-M2.7 and is built by MiniMax."*
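
Downstream code often needs to separate reasoning content from the final answer. A minimal sketch, assuming the raw response wraps reasoning in a single `<think>` / `</think>` block as described above (the exact serving-side behavior depends on how turboquant-vllm parses the template):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer), assuming one <think>...</think> block."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

r, a = split_reasoning("<think>Check the units first.</think>The answer is 42.")
# r == "Check the units first.", a == "The answer is 42."
```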

## Tokenizer

- **Backend:** tokenizers
- **Vocabulary Size:** (see tokenizer files)
- **Special Tokens:** Includes tokens for tool calls, reasoning markers, and standard control tokens

## Quantization Details

This is a 3-bit quantized checkpoint intended for efficient inference. The quantization was applied using the TurboQuant method via the turboquant-vllm project.
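
TurboQuant's actual algorithm lives in the turboquant-vllm repository; as a generic illustration of what 3-bit weight quantization involves, here is a plain symmetric round-to-nearest sketch (not the TurboQuant scheme itself):

```python
def quantize_3bit(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric round-to-nearest quantization to the 3-bit signed range [-4, 3]."""
    scale = max(abs(w) for w in weights) / 4  # map max magnitude onto 4 levels
    q = [max(-4, min(3, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate weights from 3-bit codes and a per-group scale."""
    return [v * scale for v in q]

q, s = quantize_3bit([0.5, -1.0, 0.25, 2.0])
approx = dequantize(q, s)  # each value within one quantization step of the original
```

Real schemes layer on per-group scales, outlier handling, and packed storage, but the round-trip above is the core idea: 8 representable levels per weight instead of 65,536 for bf16.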

## Disclaimer

This is a third-party quantized version of the original MiniMax-M2.7 model. Please refer to the original model card for base model details and licensing.