Instructions to use lmstudio-community/MiniMax-M2-MLX-6bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use lmstudio-community/MiniMax-M2-MLX-6bit with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="lmstudio-community/MiniMax-M2-MLX-6bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lmstudio-community/MiniMax-M2-MLX-6bit")
model = AutoModelForCausalLM.from_pretrained("lmstudio-community/MiniMax-M2-MLX-6bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

MLX

How to use lmstudio-community/MiniMax-M2-MLX-6bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("lmstudio-community/MiniMax-M2-MLX-6bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

vLLM

How to use lmstudio-community/MiniMax-M2-MLX-6bit with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lmstudio-community/MiniMax-M2-MLX-6bit"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lmstudio-community/MiniMax-M2-MLX-6bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/lmstudio-community/MiniMax-M2-MLX-6bit

SGLang

How to use lmstudio-community/MiniMax-M2-MLX-6bit with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "lmstudio-community/MiniMax-M2-MLX-6bit" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lmstudio-community/MiniMax-M2-MLX-6bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "lmstudio-community/MiniMax-M2-MLX-6bit" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lmstudio-community/MiniMax-M2-MLX-6bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Pi

How to use lmstudio-community/MiniMax-M2-MLX-6bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "lmstudio-community/MiniMax-M2-MLX-6bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "lmstudio-community/MiniMax-M2-MLX-6bit"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use lmstudio-community/MiniMax-M2-MLX-6bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "lmstudio-community/MiniMax-M2-MLX-6bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default lmstudio-community/MiniMax-M2-MLX-6bit

Run Hermes

hermes

MLX LM

How to use lmstudio-community/MiniMax-M2-MLX-6bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "lmstudio-community/MiniMax-M2-MLX-6bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "lmstudio-community/MiniMax-M2-MLX-6bit"
# Call the OpenAI-compatible server with curl (mlx_lm.server listens on port 8080 by default)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "lmstudio-community/MiniMax-M2-MLX-6bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'

Docker Model Runner
How to use lmstudio-community/MiniMax-M2-MLX-6bit with Docker Model Runner:
```
docker model run hf.co/lmstudio-community/MiniMax-M2-MLX-6bit
```

lmmy committed on Oct 29, 2025

Commit c9e9a9d (verified) · 1 Parent(s): 110e792

Add files using upload-large-folder tool

Files changed (50)

.gitattributes +1 -0
README.md +27 -0
added_tokens.json +56 -0
chat_template.jinja +159 -0
config.json +614 -0
generation_config.json +7 -0
merges.txt +0 -0
model-00001-of-00038.safetensors +3 -0
model-00002-of-00038.safetensors +3 -0
model-00003-of-00038.safetensors +3 -0
model-00004-of-00038.safetensors +3 -0
model-00005-of-00038.safetensors +3 -0
model-00006-of-00038.safetensors +3 -0
model-00007-of-00038.safetensors +3 -0
model-00008-of-00038.safetensors +3 -0
model-00009-of-00038.safetensors +3 -0
model-00010-of-00038.safetensors +3 -0
model-00011-of-00038.safetensors +3 -0
model-00012-of-00038.safetensors +3 -0
model-00013-of-00038.safetensors +3 -0
model-00014-of-00038.safetensors +3 -0
model-00015-of-00038.safetensors +3 -0
model-00016-of-00038.safetensors +3 -0
model-00017-of-00038.safetensors +3 -0
model-00018-of-00038.safetensors +3 -0
model-00019-of-00038.safetensors +3 -0
model-00020-of-00038.safetensors +3 -0
model-00021-of-00038.safetensors +3 -0
model-00022-of-00038.safetensors +3 -0
model-00023-of-00038.safetensors +3 -0
model-00024-of-00038.safetensors +3 -0
model-00025-of-00038.safetensors +3 -0
model-00026-of-00038.safetensors +3 -0
model-00027-of-00038.safetensors +3 -0
model-00028-of-00038.safetensors +3 -0
model-00029-of-00038.safetensors +3 -0
model-00030-of-00038.safetensors +3 -0
model-00031-of-00038.safetensors +3 -0
model-00032-of-00038.safetensors +3 -0
model-00033-of-00038.safetensors +3 -0
model-00034-of-00038.safetensors +3 -0
model-00035-of-00038.safetensors +3 -0
model-00036-of-00038.safetensors +3 -0
model-00037-of-00038.safetensors +3 -0
model-00038-of-00038.safetensors +3 -0
model.safetensors.index.json +0 -0
special_tokens_map.json +75 -0
tokenizer.json +3 -0
tokenizer_config.json +497 -0
vocab.json +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,27 @@

+---
+pipeline_tag: text-generation
+license: mit
+library_name: transformers
+tags:
+- mlx
+base_model: MiniMaxAI/MiniMax-M2
+---
+## 💫 Community Model> MiniMax-M2 by MiniMaxAI
+_👾 [LM Studio](https://lmstudio.ai) Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on [Discord](https://discord.gg/aPQfnNkxGC)_.
+**Model creator**: [MiniMaxAI](https://huggingface.co/MiniMaxAI)<br>
+**Original model**: [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2)<br>
+**MLX quantization**: provided by [LM Studio team](https://x.com/lmstudio) using [mlx_lm](https://github.com/ml-explore/mlx-lm)<br>
+## Technical Details
+6-bit quantized version of MiniMax-M2 using MLX, optimized for Apple Silicon.
+## Special thanks
+🙏 Special thanks to the [Apple Machine Learning Research](https://github.com/ml-explore) team for creating [MLX](https://github.com/ml-explore/mlx).
+## Disclaimers
+LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.

added_tokens.json ADDED

	@@ -0,0 +1,56 @@

+{
+  "</minimax:tool_call>": 200053,
+  "</think>": 200051,
+  "<add_file>": 200036,
+  "<code_context>": 200043,
+  "<code_interpreter>": 200023,
+  "<commit_after>": 200018,
+  "<commit_before>": 200016,
+  "<commit_message>": 200040,
+  "<commit_msg>": 200017,
+  "<delete_file>": 200037,
+  "<edit_file>": 200039,
+  "<empty_output>": 200015,
+  "<empty_source_file>": 200041,
+  "<file_content>": 200044,
+  "<file_sep>": 200049,
+  "<filename>": 200006,
+  "<filepath>": 200048,
+  "<fim_middle>": 200002,
+  "<fim_pad>": 200004,
+  "<fim_prefix>": 200001,
+  "<fim_suffix>": 200003,
+  "<function_call>": 200022,
+  "<gh_stars>": 200007,
+  "<issue_closed>": 200010,
+  "<issue_comment>": 200009,
+  "<issue_start>": 200008,
+  "<jupyter_code>": 200013,
+  "<jupyter_error>": 200035,
+  "<jupyter_output>": 200014,
+  "<jupyter_start>": 200011,
+  "<jupyter_text>": 200012,
+  "<minimax:tool_call>": 200052,
+  "<pr_start>": 200046,
+  "<rename_file>": 200038,
+  "<repo_struct>": 200042,
+  "<reponame>": 200005,
+  "<review_comment>": 200047,
+  "<source_files>": 200045,
+  "<think>": 200050,
+  "[e~[": 200020,
+  "]!d~[": 200021,
+  "]!p~[": 200000,
+  "]<]end of image[>[": 200030,
+  "]<]end of speech[>[": 200028,
+  "]<]end of video[>[": 200032,
+  "]<]image[>[": 200025,
+  "]<]speech[>[": 200024,
+  "]<]start of image[>[": 200029,
+  "]<]start of speech[>[": 200027,
+  "]<]start of video[>[": 200031,
+  "]<]video[>[": 200026,
+  "]<]vision pad[>[": 200033,
+  "]~!b[": 200034,
+  "]~b]": 200019
+}
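
The mapping above assigns each added special token a fixed ID. As a rough illustration (not part of the commit), the sketch below inverts a small excerpt of that mapping to look up tokens by ID, and checks that each closing marker in the excerpt sits one ID above its opening marker. The names `ADDED_TOKENS`, `ID_TO_TOKEN`, and `closing_tag` are hypothetical; only the token strings and IDs are copied from the file.

```python
# Excerpt of added_tokens.json (token -> id), copied from the diff above.
ADDED_TOKENS = {
    "<think>": 200050,
    "</think>": 200051,
    "<minimax:tool_call>": 200052,
    "</minimax:tool_call>": 200053,
    "]~!b[": 200034,
    "]~b]": 200019,
    "[e~[": 200020,
}

# Reverse lookup (id -> token). Hypothetical helper, not an mlx-lm or
# transformers API.
ID_TO_TOKEN = {token_id: token for token, token_id in ADDED_TOKENS.items()}


def closing_tag(opening: str) -> str:
    """Return the closing form of an XML-style marker, e.g. <think> -> </think>."""
    return "</" + opening[1:]


for opening in ("<think>", "<minimax:tool_call>"):
    closing = closing_tag(opening)
    # In this excerpt, each closing marker's ID is the opening marker's ID plus one.
    assert ADDED_TOKENS[closing] == ADDED_TOKENS[opening] + 1

print(ID_TO_TOKEN[200050])
```

In practice the tokenizer loaded from the repository already knows these IDs; a lookup table like this is mainly useful when inspecting raw token IDs by hand.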

chat_template.jinja ADDED

	@@ -0,0 +1,159 @@

+{# ----------‑‑‑ special token variables ‑‑‑---------- #}
+{%- set toolcall_begin_token   = '<minimax:tool_call>'         -%}
+{%- set toolcall_end_token     = '</minimax:tool_call>'        -%}
+{#- Tool Rendering Functions ============================================== -#}
+{%- macro render_tool_namespace(namespace_name, tool_list) -%}
+{%- for tool in tool_list -%}
+<tool>{{ tool.function | tojson(ensure_ascii=False) }}</tool>
+{% endfor -%}
+{%- endmacro -%}
+{%- macro visible_text(content) -%}
+    {%- if content is string -%}
+        {{ content }}
+    {%- elif content is iterable and content is not mapping -%}
+        {%- for item in content -%}
+            {%- if item is mapping and item.type == 'text' -%}
+                {{- item.text }}
+            {%- elif item is string -%}
+                {{- item }}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- else -%}
+        {{- content }}
+    {%- endif -%}
+{%- endmacro -%}
+{#- System Message Construction ============================================ -#}
+{%- macro build_system_message(system_message) -%}
+    {%- if system_message and system_message.content -%}
+        {{- visible_text(system_message.content) }}
+    {%- else -%}
+        {%- if model_identity is not defined -%}
+            {%- set model_identity = "You are a helpful assistant." -%}
+        {%- endif -%}
+        {{- model_identity }}
+    {%- endif -%}
+    {#- Handle current_date -#}
+    {%- if system_message and system_message.current_date -%}
+        {{- '\n' ~ 'Current date: ' + system_message.current_date }}
+    {%- endif -%}
+    {#- Handle current_location -#}
+    {%- if system_message and system_message.current_location -%}
+        {{- '\n' ~ 'Current location: ' + system_message.current_location }}
+    {%- endif -%}
+{%- endmacro -%}
+{#- Main Template Logic ================================================= -#}
+{#- Extract system message (only first message if it's system) -#}
+{%- set system_message = none -%}
+{%- set conversation_messages = messages -%}
+{%- if messages and messages[0].role == "system" -%}
+    {%- set system_message = messages[0] -%}
+    {%- set conversation_messages = messages[1:] -%}
+{%- endif -%}
+{#- Get the last user message turn, for interleved thinking -#}
+{%- set ns = namespace(last_user_index=-1) %}
+{% for m in conversation_messages %}
+    {%- if m.role == 'user' %}
+        {% set ns.last_user_index = loop.index0 -%}
+    {%- endif %}
+{%- endfor %}
+{#- Render system message -#}
+{{- ']~!b[' ~ ']~b]system' ~ '\n' }}
+{{- build_system_message(system_message) }}
+{#- Render tools if available -#}
+{%- if tools -%}
+    {{- '\n\n' ~ '# Tools' ~ '\n' ~ 'You may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema format:' ~ '\n' }}
+    {{- '\n' ~ '<tools>' ~ '\n' }}
+    {{- render_tool_namespace("functions", tools) }}
+    {{- '</tools>' ~ '\n\n' }}
+{{- 'When making tool calls, use XML format to invoke tools and pass parameters:' ~ '\n' }}
+{{- '\n' ~ toolcall_begin_token }}
+<invoke name="tool-name-1">
+<parameter name="param-key-1">param-value-1</parameter>
+<parameter name="param-key-2">param-value-2</parameter>
+...
+</invoke>
+{{- '\n' ~ toolcall_end_token }}
+{%- endif -%}
+{{- '[e~[\n' }}
+{#- Render messages -#}
+{%- set last_tool_call = namespace(name=none) -%}
+{%- for message in conversation_messages -%}
+    {%- if message.role == 'assistant' -%}
+        {#- Only render reasoning_content if no user message follows -#}
+        {{- ']~b]ai' ~ '\n' }}
+        {%- set reasoning_content = '' %}
+        {%- set content = visible_text(message.content) %}
+        {%- if message.reasoning_content is string %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in content %}
+                {%- set reasoning_content = content.split('</think>')[0].strip('\n').split('<think>')[-1].strip('\n') %}
+                {%- set content = content.split('</think>')[-1].strip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- if reasoning_content and loop.index0 > ns.last_user_index -%}
+            {{- '<think>' ~ '\n' ~ reasoning_content ~ '\n' ~ '</think>' ~ '\n\n' }}
+        {%- endif -%}
+        {%- if content -%}
+            {{- content }}
+        {%- endif -%}
+        {%- if message.tool_calls -%}
+            {{- '\n' ~ toolcall_begin_token ~ '\n' }}
+            {%- for tool_call in message.tool_calls -%}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<invoke name="' + tool_call.name + '">' }}
+                {% set _args = tool_call.arguments %}
+                {%- for k, v in _args.items() %}
+                {{- '<parameter name="' + k + '">' }}
+                {{- v | tojson(ensure_ascii=False) if v is not string else v }}
+                {{- '</parameter>' }}
+                {% endfor %}
+                {{- '</invoke>' ~ '\n' }}
+            {%- endfor -%}
+            {{- toolcall_end_token}}
+            {%- set last_tool_call.name = message.tool_calls[-1].name -%}
+        {%- else -%}
+            {%- set last_tool_call.name = none -%}
+        {%- endif -%}
+        {{- '[e~[' ~ '\n' }}
+    {%- elif message.role == 'tool' -%}
+    {%- if last_tool_call.name is none -%}
+        {{- raise_exception("Message has tool role, but there was no previous assistant message with a tool call!") }}
+    {%- endif -%}
+    {%- if loop.first or (conversation_messages[loop.index0 - 1].role != 'tool') -%}
+        {{- ']~b]tool' }}
+    {%- endif -%}
+    {%- if message.content is string -%}
+        {{- '\n<response>' }}
+        {{- message.content }}
+        {{- '</response>' }}
+    {%- else -%}
+        {%- for tr in message.content -%}
+            {{- '\n<response>' }}
+            {{- tr.output if tr.output is defined else (tr.text if tr.type == 'text' and tr.text is defined else tr) }}
+            {{- '\n</response>' }}
+        {%- endfor -%}
+    {%- endif -%}
+    {%- if loop.last or (conversation_messages[loop.index0 + 1].role != 'tool') -%}
+        {{- '[e~[\n' -}}
+    {%- endif -%}
+    {%- elif message.role == 'user' -%}
+        {{- ']~b]user' ~ '\n' }}
+        {{- visible_text(message.content) }}
+        {{- '[e~[' ~ '\n' }}
+    {%- endif -%}
+{%- endfor -%}
+{#- Generation prompt -#}
+{%- if add_generation_prompt -%}
+{{- ']~b]ai' ~ '\n' ~ '<think>' ~ '\n' }}
+{%- endif -%}
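
To make the prompt layout produced by the template above easier to see, here is a minimal Python sketch that reproduces it for the simplest case only: plain-string messages, no tools, no tool calls, and no reasoning content. The function `render_prompt` and the constant names are hypothetical and not part of the commit. The exact whitespace assumes the template is rendered with Jinja's `trim_blocks` and `lstrip_blocks` enabled, as Transformers does, so `tokenizer.apply_chat_template` remains the authoritative renderer.

```python
# Control strings copied from chat_template.jinja above.
BOS = "]~!b["
ROLE = "]~b]"
END = "[e~["
DEFAULT_SYSTEM = "You are a helpful assistant."


def render_prompt(messages, add_generation_prompt=True):
    """Hypothetical, simplified re-implementation for plain-text turns only.

    Assumes string contents, no tools, no tool calls, no reasoning content.
    """
    system = DEFAULT_SYSTEM
    turns = list(messages)
    # Only a leading system message is treated as the system prompt.
    if turns and turns[0]["role"] == "system":
        system = turns[0]["content"] or DEFAULT_SYSTEM
        turns = turns[1:]

    parts = [f"{BOS}{ROLE}system\n{system}{END}\n"]
    for message in turns:
        if message["role"] == "user":
            parts.append(f"{ROLE}user\n{message['content']}{END}\n")
        elif message["role"] == "assistant":
            # The template labels assistant turns "ai".
            parts.append(f"{ROLE}ai\n{message['content']}{END}\n")
        else:
            # Tool turns and other roles are outside the scope of this sketch.
            raise ValueError(f"unsupported role in this sketch: {message['role']}")
    if add_generation_prompt:
        # The template opens a <think> block for the model to continue.
        parts.append(f"{ROLE}ai\n<think>\n")
    return "".join(parts)


prompt = render_prompt([{"role": "user", "content": "Who are you?"}])
print(repr(prompt))
```

Because the generation prompt already ends with an opening `<think>` tag, a completion from the model would be expected to begin inside the reasoning block and close it with `</think>` before the visible answer.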

config.json ADDED

	@@ -0,0 +1,614 @@

+{
+    "architectures": [
+        "MiniMaxM2ForCausalLM"
+    ],
+    "attention_dropout": 0.0,
+    "attn_type_list": [
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1,
+        1
+    ],
+    "bos_token_id": null,
+    "eos_token_id": null,
+    "head_dim": 128,
+    "hidden_act": "silu",
+    "hidden_size": 3072,
+    "initializer_range": 0.02,
+    "intermediate_size": 1536,
+    "layernorm_full_attention_beta": 1.0,
+    "layernorm_linear_attention_beta": 1.0,
+    "layernorm_mlp_beta": 1.0,
+    "max_position_embeddings": 196608,
+    "mlp_intermediate_size": 8192,
+    "model_type": "minimax",
+    "mtp_transformer_layers": 1,
+    "num_attention_heads": 48,
+    "num_experts_per_tok": 8,
+    "num_hidden_layers": 62,
+    "num_key_value_heads": 8,
+    "num_local_experts": 256,
+    "num_mtp_modules": 3,
+    "output_router_logits": false,
+    "qk_norm_type": "per_layer",
+    "quantization": {
+        "group_size": 64,
+        "bits": 6,
+        "mode": "affine",
+        "model.layers.0.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.1.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.2.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.3.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.4.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.5.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.6.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.7.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.8.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.9.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.10.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.11.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.12.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.13.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.14.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.15.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.16.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.17.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.18.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.19.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.20.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.21.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.22.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.23.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.24.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.25.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.26.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.27.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.28.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.29.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.30.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.31.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.32.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.33.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.34.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.35.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.36.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.37.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.38.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.39.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.40.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.41.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.42.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.43.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.44.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.45.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.46.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.47.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.48.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.49.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.50.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.51.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.52.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.53.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.54.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.55.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.56.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.57.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.58.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.59.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.60.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.61.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        }
+    },
+    "quantization_config": {
+        "group_size": 64,
+        "bits": 6,
+        "mode": "affine",
+        "model.layers.0.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.1.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.2.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.3.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.4.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.5.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.6.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.7.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.8.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.9.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.10.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.11.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.12.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.13.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.14.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.15.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.16.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.17.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.18.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.19.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.20.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.21.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.22.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.23.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.24.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.25.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.26.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.27.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.28.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.29.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.30.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.31.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.32.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.33.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.34.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.35.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.36.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.37.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.38.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.39.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.40.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.41.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.42.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.43.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.44.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.45.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.46.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.47.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.48.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.49.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.50.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.51.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.52.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.53.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.54.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.55.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.56.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.57.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.58.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.59.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.60.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        },
+        "model.layers.61.block_sparse_moe.gate": {
+            "group_size": 64,
+            "bits": 8
+        }
+    },
+    "rms_norm_eps": 1e-06,
+    "rope_theta": 5000000,
+    "rotary_dim": 64,
+    "router_aux_loss_coef": 0.001,
+    "router_jitter_noise": 0.0,
+    "scoring_func": "sigmoid",
+    "shared_intermediate_size": 0,
+    "shared_moe_mode": "sigmoid",
+    "sliding_window": null,
+    "tie_word_embeddings": false,
+    "transformers_version": "4.46.1",
+    "use_cache": true,
+    "use_mtp": true,
+    "use_qk_norm": true,
+    "use_routing_bias": true,
+    "vocab_size": 200064
+}
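
The per-layer entries above give every `block_sparse_moe.gate` its own quantization setting (`group_size` 64, `bits` 8). As a minimal sketch of how such overrides could be collected, the hypothetical helper `gate_overrides` below takes the mapping that holds these entries and returns the settings keyed by layer index. The name of the enclosing config key is not shown in this part of the diff, so the function deliberately accepts the mapping itself, and `sample` reproduces only two of the entries.

```python
import re

# Matches keys such as "model.layers.61.block_sparse_moe.gate".
GATE_KEY = re.compile(r"^model\.layers\.(\d+)\.block_sparse_moe\.gate$")


def gate_overrides(quant_section):
    """Return {layer_index: (group_size, bits)} for per-layer MoE gate entries.

    `quant_section` is assumed to be the dict that contains the per-layer
    entries; keys that do not look like a gate entry are ignored.
    """
    result = {}
    for key, value in quant_section.items():
        match = GATE_KEY.match(key)
        if match and isinstance(value, dict):
            result[int(match.group(1))] = (value["group_size"], value["bits"])
    return result


# Two entries copied from the diff; the real file lists one per layer.
sample = {
    "model.layers.60.block_sparse_moe.gate": {"group_size": 64, "bits": 8},
    "model.layers.61.block_sparse_moe.gate": {"group_size": 64, "bits": 8},
    "group_size": 64,  # non-gate keys are skipped
}
overrides = gate_overrides(sample)
print(sorted(overrides.items()))
```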

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "do_sample": true,
+  "temperature": 1.0,
+  "top_p": 0.95,
+  "top_k": 40,
+  "transformers_version": "4.46.1"
+}
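
The generation defaults above enable sampling with `temperature` 1.0, `top_k` 40, and `top_p` 0.95. The following is a rough, library-independent sketch of what those three settings do to a next-token distribution. The function name `filter_probs` is hypothetical, and the order of the filters and the handling of ties differ between inference libraries, so this should be read as an illustration rather than the behavior of any particular runtime.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=40, top_p=0.95):
    """Apply temperature, then top-k, then top-p; return {token_id: prob}."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable token ids.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    order = order[:top_k]

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving probabilities so they sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


candidates = filter_probs([2.0, 1.0, 0.5, -1.0, -3.0])
token_id = random.choices(list(candidates), weights=list(candidates.values()))[0]
print(token_id, candidates)
```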

merges.txt ADDED

The diff for this file is too large to render.

model-00001-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d132f3f45b669b687df185c1f8664a255eb3217fb4de74686d4746e040f6099b
+size 4498611286

model-00002-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:585ec5cc3024d5425b87488e732005fb15a8a3f120a8b8eb872ae57c08493f68
+size 4980732462

model-00003-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cb9e7528ea5afdd712ce860d7554c0da539924c02c7d74e4aab507bb3966b04d
+size 4944035148

model-00004-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ecf4dc0cc50a5a71a29fdb329337c060cb72ccaeb6325e7120f9df0fc54e6285
+size 4980732494

model-00005-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:14e13e2a9fb42ee57d0b8286a7d82b4366bcc066051a9d76cc06b898e7733c52
+size 4980732440

model-00006-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9cb379ce0497a3a9482b4934d5a52fe66915d8a813ee0685f49a233d033619fe
+size 4944035148

model-00007-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0acc3bc7748ca4d912addf995084328920d9337201e246a4cd7548099bfab619
+size 4980732543

model-00008-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:76b15447a277ced959c9fce219e66e4dc66f3756ecac458d80f88f6548ca0386
+size 4980732511

model-00009-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:492fbe4e41d20d0ec22edff0b89257606cab8f16b7b42ab42484565906cfc661
+size 4944035183

model-00010-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d5580b98c2866f3b9891e564c103a0ef6618cd12d07d481400b2f8836b05f058
+size 4980732533

model-00011-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3e86ccd87baf3791c93a79098ea6d263a543ca900d66f300ade171a824c91b2c
+size 4980732501

model-00012-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:392f5dc656c7e882d72f926b13f721f2228ed99c945461de1d45f053933918c8
+size 4944035153

model-00013-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:65065e81ac815d0ec69b70f3073b18d84423db0dc8b651e0790d73cc90e64c3c
+size 4980732517

model-00014-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c5af2c5ae5c040a4608de192cf524d728f9019c81bbe206cb53bfbb93431fc24
+size 4980732513

model-00015-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e3e42e69ab3c4890b5a4aabc61460f2611d3a8836b6b71a0f687df2a786ba956
+size 4944035183

model-00016-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3b392c6a3390007463cb5b62a324eca66431e62ed6b05ffe57ba50c559091bb9
+size 4980732545

model-00017-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:84f5f9e80a5ed182849269fef314defe45340924fdcf355fd86da819150fda68
+size 4980732511

model-00018-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:92922013ea374800ecb6760e29b14967ebeb662a94e966418bb0f2b774442f34
+size 4944035183

model-00019-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a3979a295f28ccd2295cf95103f342331d46e7f8f453b5165411506f0e2c7a14
+size 4980732515

model-00020-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8a9d4959fe9f5c262df41cee3d05f32e3bb87f63612ebd63bd8536774d63739c
+size 4980732513

model-00021-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b16143b39042d3ec0d1370c2f0d16620305a792d879b11e706ac16f591d6b939
+size 4944035175

model-00022-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a0b3cc63114f085e4da179006306287ba559265958029072eb7da1288e3ca1fb
+size 4980732545

model-00023-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:51fae3d83a6955b6906e2474229f5c524ba0cf45bfa43190d5deaf6365347a5f
+size 4980732493

model-00024-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4ea57b899408f51f6965885cc64d03a4377c84da84a90d8717a38decec37cac9
+size 4944035173

model-00025-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:922e77ac5ec7eb71eb31c0608af1550d247cdaa264c991ea179068f2eda8b92c
+size 4980732553

model-00026-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f0995a21a08bc677c1c7faf753f42d5dd073636acec247efa69111a5eae1b500
+size 4980732497

model-00027-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1ffaa3c59793a5413f8be7b0301b067765d8ef2baada49142662925dc4f8bbdb
+size 4944035181

model-00028-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c60f4cecc0b39e7e059d9bafecb739d3210182b305f72ad804bf6388c6aae539
+size 4980732561

model-00029-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4b83da9e839a3835fa638ae1ce14d0c19fb837b26d7ca94f597665ea06b5c5ac
+size 4980732505

model-00030-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ca8930769abfcbba15c03e6bd1c07dda302722cbef9ea9b2a5ae228a9512bac1
+size 4944035169

model-00031-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9767c1c00f0ac795614d3eb243e1043ef8d98df070b6f4deec793974169a09d3
+size 4980732535

model-00032-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9287bd5f61a6e042f38d08c6193248779b6733d36b2194e1491c6d211d08419c
+size 4980732489

model-00033-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2e201fa92ae674c15c9b8a4e73d9ad54b992b8fe30d2e5484b9b9d33b8cf4bb3
+size 4944035187

model-00034-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:69143bfc34eb6ab0d26cd3310af1b19ef8c6cf8549d755776984ad317d193b11
+size 4980732551

model-00035-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1aa34c4777e93e6f833604b8055de4ce3ba2667f8bae7588de30befbf4b318d1
+size 4980732489

model-00036-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4e9373ed9e2e4593ef0d7b4deb1291fa0f6757e76639b7f42b13b0d447696da4
+size 4944035185

model-00037-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f8fb67536eafcf5dc2b7575fe0254fdea23f6e949b811cc47603ce0a938e56e9
+size 4980732521

model-00038-of-00038.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e7480f64d9d77a0ca39f43f571dbeccb2e1a70be0868d082fc93724f5a6c9e0c
+size 2462315047
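
Each `.safetensors` entry above is a Git LFS pointer rather than the weights themselves: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. A minimal sketch of reading such a pointer and checking a downloaded shard against it is shown below. The helper names `parse_lfs_pointer` and `verify_file` are hypothetical, and the pointer text is the last shard's entry with the diff's leading `+` removed.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, info, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != info["size"]:
        return False
    hasher = hashlib.new(info["hash_algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == info["digest"]


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e7480f64d9d77a0ca39f43f571dbeccb2e1a70be0868d082fc93724f5a6c9e0c
size 2462315047
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Hashing in fixed-size chunks keeps memory use flat, which matters for shards that are several gigabytes each.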

model.safetensors.index.json ADDED

The diff for this file is too large to render.

special_tokens_map.json ADDED

	@@ -0,0 +1,75 @@

+{
+  "additional_special_tokens": [
+    "<code_interpreter>",
+    "<commit_after>",
+    "<commit_before>",
+    "<commit_msg>",
+    "<empty_output>",
+    "<filename>",
+    "<fim_middle>",
+    "<fim_pad>",
+    "<fim_prefix>",
+    "<fim_suffix>",
+    "<function_call>",
+    "<gh_stars>",
+    "]<]speech[>[",
+    "]<]image[>[",
+    "]<]video[>[",
+    "]<]start of speech[>[",
+    "]<]end of speech[>[",
+    "]<]start of image[>[",
+    "]<]end of image[>[",
+    "]<]start of video[>[",
+    "]<]end of video[>[",
+    "]<]vision pad[>[",
+    "]~!b[",
+    "<issue_closed>",
+    "<issue_comment>",
+    "<issue_start>",
+    "<jupyter_code>",
+    "<jupyter_output>",
+    "<jupyter_start>",
+    "<jupyter_text>",
+    "<reponame>",
+    "[e~[",
+    "]!d~[",
+    "]!p~[",
+    "]~b]",
+    "<jupyter_error>",
+    "<add_file>",
+    "<delete_file>",
+    "<rename_file>",
+    "<edit_file>",
+    "<commit_message>",
+    "<empty_source_file>",
+    "<repo_struct>",
+    "<code_context>",
+    "<file_content>",
+    "<source_files>",
+    "<pr_start>",
+    "<review_comment>",
+    "<filepath>",
+    "<file_sep>"
+  ],
+  "bos_token": {
+    "content": "]~!b[",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "[e~[",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "]!d~[",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e7b90ed7f55d905175bc26771d6d7d33b40b46742f073675bc816fedaf482ea1
+size 15522763

tokenizer_config.json ADDED

	@@ -0,0 +1,497 @@

+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "200000": {
+      "content": "]!p~[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200001": {
+      "content": "<fim_prefix>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200002": {
+      "content": "<fim_middle>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200003": {
+      "content": "<fim_suffix>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200004": {
+      "content": "<fim_pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200005": {
+      "content": "<reponame>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200006": {
+      "content": "<filename>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200007": {
+      "content": "<gh_stars>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200008": {
+      "content": "<issue_start>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200009": {
+      "content": "<issue_comment>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200010": {
+      "content": "<issue_closed>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200011": {
+      "content": "<jupyter_start>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200012": {
+      "content": "<jupyter_text>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200013": {
+      "content": "<jupyter_code>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200014": {
+      "content": "<jupyter_output>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200015": {
+      "content": "<empty_output>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200016": {
+      "content": "<commit_before>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200017": {
+      "content": "<commit_msg>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200018": {
+      "content": "<commit_after>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200019": {
+      "content": "]~b]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200020": {
+      "content": "[e~[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200021": {
+      "content": "]!d~[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200022": {
+      "content": "<function_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200023": {
+      "content": "<code_interpreter>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200024": {
+      "content": "]<]speech[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200025": {
+      "content": "]<]image[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200026": {
+      "content": "]<]video[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200027": {
+      "content": "]<]start of speech[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200028": {
+      "content": "]<]end of speech[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200029": {
+      "content": "]<]start of image[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200030": {
+      "content": "]<]end of image[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200031": {
+      "content": "]<]start of video[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200032": {
+      "content": "]<]end of video[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200033": {
+      "content": "]<]vision pad[>[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200034": {
+      "content": "]~!b[",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200035": {
+      "content": "<jupyter_error>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200036": {
+      "content": "<add_file>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200037": {
+      "content": "<delete_file>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200038": {
+      "content": "<rename_file>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200039": {
+      "content": "<edit_file>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200040": {
+      "content": "<commit_message>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200041": {
+      "content": "<empty_source_file>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200042": {
+      "content": "<repo_struct>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200043": {
+      "content": "<code_context>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200044": {
+      "content": "<file_content>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200045": {
+      "content": "<source_files>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200046": {
+      "content": "<pr_start>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200047": {
+      "content": "<review_comment>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200048": {
+      "content": "<filepath>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200049": {
+      "content": "<file_sep>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "200050": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "200051": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "200052": {
+      "content": "<minimax:tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "200053": {
+      "content": "</minimax:tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<code_interpreter>",
+    "<commit_after>",
+    "<commit_before>",
+    "<commit_msg>",
+    "<empty_output>",
+    "<filename>",
+    "<fim_middle>",
+    "<fim_pad>",
+    "<fim_prefix>",
+    "<fim_suffix>",
+    "<function_call>",
+    "<gh_stars>",
+    "]<]speech[>[",
+    "]<]image[>[",
+    "]<]video[>[",
+    "]<]start of speech[>[",
+    "]<]end of speech[>[",
+    "]<]start of image[>[",
+    "]<]end of image[>[",
+    "]<]start of video[>[",
+    "]<]end of video[>[",
+    "]<]vision pad[>[",
+    "]~!b[",
+    "<issue_closed>",
+    "<issue_comment>",
+    "<issue_start>",
+    "<jupyter_code>",
+    "<jupyter_output>",
+    "<jupyter_start>",
+    "<jupyter_text>",
+    "<reponame>",
+    "[e~[",
+    "]!d~[",
+    "]!p~[",
+    "]~b]",
+    "<jupyter_error>",
+    "<add_file>",
+    "<delete_file>",
+    "<rename_file>",
+    "<edit_file>",
+    "<commit_message>",
+    "<empty_source_file>",
+    "<repo_struct>",
+    "<code_context>",
+    "<file_content>",
+    "<source_files>",
+    "<pr_start>",
+    "<review_comment>",
+    "<filepath>",
+    "<file_sep>"
+  ],
+  "bos_token": "]~!b[",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "[e~[",
+  "extra_special_tokens": {},
+  "model_max_length": 40960000,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "]!d~[",
+  "chat_template": "{# ----------\u2011\u2011\u2011 special token variables \u2011\u2011\u2011---------- #}\n{%- set toolcall_begin_token   = '<minimax:tool_call>'         -%}\n{%- set toolcall_end_token     = '</minimax:tool_call>'        -%}\n{#- Tool Rendering Functions ============================================== -#}\n{%- macro render_tool_namespace(namespace_name, tool_list) -%}\n{%- for tool in tool_list -%}\n<tool>{{ tool.function | tojson(ensure_ascii=False) }}</tool>\n{% endfor -%}\n{%- endmacro -%}\n{%- macro visible_text(content) -%}\n    {%- if content is string -%}\n        {{ content }}\n    {%- elif content is iterable and content is not mapping -%}\n        {%- for item in content -%}\n            {%- if item is mapping and item.type == 'text' -%}\n                {{- item.text }}\n            {%- elif item is string -%}\n                {{- item }}\n            {%- endif -%}\n        {%- endfor -%}\n    {%- else -%}\n        {{- content }}\n    {%- endif -%}\n{%- endmacro -%}\n{#- System Message Construction ============================================ -#}\n{%- macro build_system_message(system_message) -%}\n    {%- if system_message and system_message.content -%}\n        {{- visible_text(system_message.content) }}\n    {%- else -%}\n        {%- if model_identity is not defined -%}\n            {%- set model_identity = \"You are a helpful assistant.\" -%}\n        {%- endif -%}\n        {{- model_identity }}\n    {%- endif -%}\n    \n    {#- Handle current_date -#}\n    {%- if system_message and system_message.current_date -%}\n        {{- '\\n' ~ 'Current date: ' + system_message.current_date }}\n    {%- endif -%}\n    {#- Handle current_location -#}\n    {%- if system_message and system_message.current_location -%}\n        {{- '\\n' ~ 'Current location: ' + system_message.current_location }}\n    {%- endif -%}\n{%- endmacro -%}\n{#- Main Template Logic ================================================= -#}\n{#- Extract system message (only 
first message if it's system) -#}\n{%- set system_message = none -%}\n{%- set conversation_messages = messages -%}\n{%- if messages and messages[0].role == \"system\" -%}\n    {%- set system_message = messages[0] -%}\n    {%- set conversation_messages = messages[1:] -%}\n{%- endif -%}\n{#- Get the last user message turn, for interleved thinking -#}\n{%- set ns = namespace(last_user_index=-1) %}\n{% for m in conversation_messages %}\n    {%- if m.role == 'user' %}\n        {% set ns.last_user_index = loop.index0 -%}\n    {%- endif %}\n{%- endfor %}\n{#- Render system message -#}\n{{- ']~!b[' ~ ']~b]system' ~ '\\n' }}\n{{- build_system_message(system_message) }}\n{#- Render tools if available -#}\n{%- if tools -%}\n    {{- '\\n\\n' ~ '# Tools' ~ '\\n' ~ 'You may call one or more tools to assist with the user query.\\nHere are the tools available in JSONSchema format:' ~ '\\n' }}\n    {{- '\\n' ~ '<tools>' ~ '\\n' }}\n    {{- render_tool_namespace(\"functions\", tools) }}\n    {{- '</tools>' ~ '\\n\\n' }}\n{{- 'When making tool calls, use XML format to invoke tools and pass parameters:' ~ '\\n' }}\n{{- '\\n' ~ toolcall_begin_token }}\n<invoke name=\"tool-name-1\">\n<parameter name=\"param-key-1\">param-value-1</parameter>\n<parameter name=\"param-key-2\">param-value-2</parameter>\n...\n</invoke>\n{{- '\\n' ~ toolcall_end_token }}\n{%- endif -%}\n{{- '[e~[\\n' }}\n\n{#- Render messages -#}\n{%- set last_tool_call = namespace(name=none) -%}\n{%- for message in conversation_messages -%}\n    {%- if message.role == 'assistant' -%}\n        {#- Only render reasoning_content if no user message follows -#}\n        {{- ']~b]ai' ~ '\\n' }}\n\n        {%- set reasoning_content = '' %}\n        {%- set content = visible_text(message.content) %}\n        {%- if message.reasoning_content is string %}\n            {%- set reasoning_content = message.reasoning_content %}\n        {%- else %}\n            {%- if '</think>' in content %}\n                {%- set reasoning_content = 
content.split('</think>')[0].strip('\\n').split('<think>')[-1].strip('\\n') %}\n                {%- set content = content.split('</think>')[-1].strip('\\n') %}\n            {%- endif %}\n        {%- endif %}\n        {%- if reasoning_content and loop.index0 > ns.last_user_index -%}\n            {{- '<think>' ~ '\\n' ~ reasoning_content ~ '\\n' ~ '</think>' ~ '\\n\\n' }}\n        {%- endif -%}\n        {%- if content -%}\n            {{- content }}\n        {%- endif -%}\n        {%- if message.tool_calls -%}\n            {{- '\\n' ~ toolcall_begin_token ~ '\\n' }}\n\n            {%- for tool_call in message.tool_calls -%}\n                {%- if tool_call.function %}\n                    {%- set tool_call = tool_call.function %}\n                {%- endif %}\n                {{- '<invoke name=\"' + tool_call.name + '\">' }}\n                {% set _args = tool_call.arguments %}\n                {%- for k, v in _args.items() %}\n                {{- '<parameter name=\"' + k + '\">' }}\n                {{- v | tojson(ensure_ascii=False) if v is not string else v }}\n                {{- '</parameter>' }}\n                {% endfor %}\n                {{- '</invoke>' ~ '\\n' }}\n            {%- endfor -%}\n            \n            {{- toolcall_end_token}}\n            {%- set last_tool_call.name = message.tool_calls[-1].name -%}\n        {%- else -%}\n            {%- set last_tool_call.name = none -%}\n        {%- endif -%}\n        {{- '[e~[' ~ '\\n' }}\n        \n    {%- elif message.role == 'tool' -%}\n    {%- if last_tool_call.name is none -%}\n        {{- raise_exception(\"Message has tool role, but there was no previous assistant message with a tool call!\") }}\n    {%- endif -%}\n    {%- if loop.first or (conversation_messages[loop.index0 - 1].role != 'tool') -%}\n        {{- ']~b]tool' }}\n    {%- endif -%}\n    {%- if message.content is string -%}\n        {{- '\\n<response>' }}\n        {{- message.content }}\n        {{- '</response>' }}\n    {%- else -%}\n  
      {%- for tr in message.content -%}\n            {{- '\\n<response>' }}\n            {{- tr.output if tr.output is defined else (tr.text if tr.type == 'text' and tr.text is defined else tr) }}\n            {{- '\\n</response>' }}\n        {%- endfor -%}\n    {%- endif -%}\n    {%- if loop.last or (conversation_messages[loop.index0 + 1].role != 'tool') -%}\n        {{- '[e~[\\n' -}}\n    {%- endif -%}\n        \n    {%- elif message.role == 'user' -%}\n        {{- ']~b]user' ~ '\\n' }}\n        {{- visible_text(message.content) }}\n        {{- '[e~[' ~ '\\n' }}\n    {%- endif -%}\n{%- endfor -%}\n\n{#- Generation prompt -#}\n{%- if add_generation_prompt -%}\n{{- ']~b]ai' ~ '\\n' ~ '<think>' ~ '\\n' }}\n{%- endif -%}\n"
+}
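
The `chat_template` above wraps every turn as `]~b]<role>`, a newline, the content, then `[e~[`, opens the prompt with `]~!b[` and a system turn, and ends the generation prompt with `]~b]ai` followed by an opening `<think>` tag. As a simplified sketch of that layout, the hypothetical function `render_simple_prompt` below handles only plain-text system, user, and assistant messages. It leaves out tools, tool responses, `reasoning_content`, and the date and location fields, and it has not been checked byte-for-byte against the Jinja output, so real use should go through `tokenizer.apply_chat_template`.

```python
BOS = "]~!b["          # bos_token from tokenizer_config.json
ROLE = "]~b]"          # marker that precedes each role name
EOS = "[e~["           # eos_token, closes every turn
DEFAULT_SYSTEM = "You are a helpful assistant."  # template's fallback identity


def render_simple_prompt(messages, add_generation_prompt=True):
    """Approximate the template for plain-text messages only."""
    system_text = DEFAULT_SYSTEM
    if messages and messages[0]["role"] == "system":
        # An empty system message falls back to the default identity.
        system_text = messages[0]["content"] or DEFAULT_SYSTEM
        messages = messages[1:]

    parts = [f"{BOS}{ROLE}system\n{system_text}{EOS}\n"]
    for message in messages:
        if message["role"] == "user":
            parts.append(f"{ROLE}user\n{message['content']}{EOS}\n")
        elif message["role"] == "assistant":
            # The template labels assistant turns "ai".
            parts.append(f"{ROLE}ai\n{message['content']}{EOS}\n")
        else:
            raise ValueError(f"role not covered by this sketch: {message['role']}")

    if add_generation_prompt:
        # Generation starts inside an open <think> block.
        parts.append(f"{ROLE}ai\n<think>\n")
    return "".join(parts)


prompt = render_simple_prompt([{"role": "user", "content": "Who are you?"}])
print(prompt)
```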

vocab.json ADDED

The diff for this file is too large to render.