Instructions for using CreitinGameplays/Mistral-Nemo-12B-R1-v0.2 with libraries and local apps.

Libraries

How to use CreitinGameplays/Mistral-Nemo-12B-R1-v0.2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="CreitinGameplays/Mistral-Nemo-12B-R1-v0.2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CreitinGameplays/Mistral-Nemo-12B-R1-v0.2")
model = AutoModelForCausalLM.from_pretrained("CreitinGameplays/Mistral-Nemo-12B-R1-v0.2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
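The chat template shipped with this model asks for a `<think>...</think>` block followed by a short summary, so the decoded text usually needs to be split before it is shown to a user. The following is a minimal sketch, assuming the model actually follows that layout; `split_reasoning` is a hypothetical helper, not part of Transformers. It also tolerates a missing `</think>`, which is likely when `max_new_tokens` is as small as in the snippet above.

```python
import re

# Assumed output shape: "<think>reasoning</think>summary". The closing tag may
# be missing when generation stops early (for example, a small max_new_tokens).
THINK_RE = re.compile(r"<think>(.*?)(?:</think>|\Z)", re.DOTALL)


def split_reasoning(text: str, eos_token: str = "</s>") -> tuple[str, str]:
    """Return (thoughts, summary) from decoded model output."""
    text = text.replace(eos_token, "")
    match = THINK_RE.search(text)
    if match is None:
        # No <think> block at all: treat everything as the summary.
        return "", text.strip()
    thoughts = match.group(1).strip()
    summary = text[match.end():].strip()
    return thoughts, summary


if __name__ == "__main__":
    sample = "<think>\n2 + 2 is 4.\n</think>\nThe answer is **4**.</s>"
    print(split_reasoning(sample))
```

If the output was cut off inside the thinking block, the summary comes back empty, which is a cheap signal that `max_new_tokens` should be raised.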

Local Apps

vLLM

How to use CreitinGameplays/Mistral-Nemo-12B-R1-v0.2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
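The same request can be sent from Python with only the standard library. This is a sketch that mirrors the curl call above; `build_chat_request` and `ask` are hypothetical helper names, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request

MODEL_ID = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"


def build_chat_request(prompt: str,
                       model: str = MODEL_ID,
                       base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, **kwargs) -> str:
    """Send the request and return the first choice's message content."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Assumption: the server answers in the OpenAI chat-completions format.
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` returns the reply text. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.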

Use Docker

docker model run hf.co/CreitinGameplays/Mistral-Nemo-12B-R1-v0.2

SGLang

How to use CreitinGameplays/Mistral-Nemo-12B-R1-v0.2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use CreitinGameplays/Mistral-Nemo-12B-R1-v0.2 with Docker Model Runner:
```
docker model run hf.co/CreitinGameplays/Mistral-Nemo-12B-R1-v0.2
```

#28

by CreitinGameplays - opened Aug 6, 2025

base: refs/heads/main ← from: refs/pr/28

tokenizer_config.json +1 -1

@@ -8005,7 +8005,7 @@
     }
   },
   "bos_token": "<s>",
-  "chat_template": "{%- set system_message = \"A user will ask you to solve a task. You should first draft your thinking process (inner monologue) until you have derived the final answer. Afterwards, write a self-contained summary of your thoughts (i.e. your summary should be succinct but contain all the critical steps you needed to reach the conclusion). You should use Markdown and Latex to format your response. Write both your thoughts and summary in the same language as the task posed by the user.\\n\\nYour thinking process must follow the template below:\\n<think>\\nYour thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate a correct answer.\\n</think>\\nHere, provide a concise summary that reflects your reasoning and presents a clear final answer to the user.\\nUser:\\n\" %}\n{%- set system_message_two = \"\\nA user will ask you to solve a task. You should first draft your thinking process (inner monologue) until you have derived the final answer. Afterwards, write a self-contained summary of your thoughts (i.e. your summary should be succinct but contain all the critical steps you needed to reach the conclusion). You should use Markdown and Latex to format your response. Write both your thoughts and summary in the same language as the task posed by the user.\\n\\nYour thinking process must follow the template below:\\n<think>\\nYour thoughts or/and draft, like working through an exercise on scratch paper. 
Be as casual and as long as you want until you are confident to generate a correct answer.\\n</think>\\nHere, provide a concise summary that reflects your reasoning and presents a clear final answer to the user.\" %}\n{# Removed external system-prompt exception: ignore any custom system prompts #}\n{%- if messages[0][\"role\"] == \"system\" %}  \n    {%- set system_message = messages[0][\"content\"] + system_message_two + \"\\nUser:\\n\" %}  \n    {%- set loop_messages = messages[1:] %}  \n{%- else %}  \n    {%- set loop_messages = messages %}  \n{%- endif %}  \n{%- if not tools is defined %}  \n    {%- set tools = none %}  \n{%- endif %}\n{%- set user_messages = loop_messages | selectattr(\"role\", \"equalto\", \"user\") | list %}  \n{#- This block checks for alternating user/assistant messages, skipping tool calling messages #}  \n{%- set ns = namespace() %}  \n{%- set ns.index = 0 %}  \n{%- for message in loop_messages %}  \n    {%- if not (message.role == \"tool\" or message.role == \"tool_results\" or (message.tool_calls is defined and message.tool_calls is not none)) %}  \n        {%- if (message[\"role\"] == \"user\") != (ns.index % 2 == 0) %}  \n            {{- raise_exception(\"After the optional system message, conversation roles must alternate user/assistant/user/assistant...\") }}  \n        {%- endif %}  \n        {%- set ns.index = ns.index + 1 %}  \n    {%- endif %}  \n{%- endfor %}\n{{- bos_token }}  \n{%- for message in loop_messages %}  \n    {%- if message[\"role\"] == \"user\" %}  \n        {%- if tools is not none and (message == user_messages[-1]) %}  \n            {{- \"[AVAILABLE_TOOLS] [\" }}  \n            {%- for tool in tools %}  \n                {%- set tool = tool.function %}  \n                {{- '{\"type\": \"function\", \"function\": {' }}  \n                {%- for key, val in tool.items() if key != \"return\" %}  \n                    {%- if val is string %}  \n                        {{- '\"' + key + '\": \"' + val + '\"' }}  
\n                    {%- else %}  \n                        {{- '\"' + key + '\": ' + val|tojson }}  \n                    {%- endif %}  \n                    {%- if not loop.last %}  \n                        {{- \", \" }}  \n                    {%- endif %}  \n                {%- endfor %}  \n                {{- \"}}\" }}  \n                {%- if not loop.last %}  \n                    {{- \", \" }}  \n                {%- else %}  \n                    {{- \"]\" }}  \n                {%- endif %}  \n            {%- endfor %}  \n            {{- \"[/AVAILABLE_TOOLS]\" }}  \n        {%- endif %}  \n        {%- if  system_message is defined %}  \n            {{- \"[INST]\" + system_message + \"\\n\" + message[\"content\"] + \"[/INST]\\n\" }}  \n        {%- else %}  \n            {{- \"[INST]\" + message[\"content\"] + \"[/INST]\\n\" }}  \n        {%- endif %}  \n    {%- elif message.tool_calls is defined and message.tool_calls is not none %}  \n        {{- \"[TOOL_CALLS] [\" }}  \n        {%- for tool_call in message.tool_calls %}  \n            {%- set out = tool_call.function|tojson %}  \n            {{- out[:-1] }}  \n            {%- if not tool_call.id is defined or tool_call.id|length != 9 %}  \n                {{- raise_exception(\"Tool call IDs should be alphanumeric strings with length 9!\") }}  \n            {%- endif %}  \n            {{- ', \"id\": \"' + tool_call.id + '\"}' }}  \n            {%- if not loop.last %}  \n                {{- \", \" }}  \n            {%- else %}  \n                {{- \"]\" + eos_token }}  \n            {%- endif %}  \n        {%- endfor %}  \n    {%- elif message[\"role\"] == \"assistant\" %}  \n        {{- \"\" + message[\"content\"]|trim + eos_token + \"\\n\"}}  \n    {%- elif message[\"role\"] == \"tool_results\" or message[\"role\"] == \"tool\" %}  \n        {%- if message.content is defined and message.content.content is defined %}  \n            {%- set content = message.content.content %}  \n        {%- else %}  \n   
         {%- set content = message.content %}  \n        {%- endif %}  \n        {{- '[TOOL_RESULTS] {\"content\": ' + content|string + \", 'call_id': '\" + message.tool_call_id + \"'}[/TOOL_RESULTS]\" }}  \n    {%- else %}  \n        {{- raise_exception(\"Only user and assistant roles are supported, with the exception of an initial optional system message!\") }}  \n    {%- endif %}  \n{%- endfor %}",
   "clean_up_tokenization_spaces": false,
   "eos_token": "</s>",
   "extra_special_tokens": {},

     }
   },
   "bos_token": "<s>",
+  "chat_template": "{%- set system_message = \"A user will ask you to solve a task. You should first draft your thinking process (inner monologue) until you have derived the final answer. Afterwards, write a self-contained summary of your thoughts (i.e. your summary should be succinct but contain all the critical steps you needed to reach the conclusion). You should use Markdown and Latex to format your response. Write both your thoughts and summary in the same language as the task posed by the user.\\n\\nYour thinking process must follow the template below:\\n<think>\\nYour thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate a correct answer.\\n</think>\\nHere, provide a concise summary that reflects your reasoning and presents a clear final answer to the user.\\n\" %}\n{%- set system_message_two = \"\\nA user will ask you to solve a task. You should first draft your thinking process (inner monologue) until you have derived the final answer. Afterwards, write a self-contained summary of your thoughts (i.e. your summary should be succinct but contain all the critical steps you needed to reach the conclusion). You should use Markdown and Latex to format your response. Write both your thoughts and summary in the same language as the task posed by the user.\\n\\nYour thinking process must follow the template below:\\n<think>\\nYour thoughts or/and draft, like working through an exercise on scratch paper. 
Be as casual and as long as you want until you are confident to generate a correct answer.\\n</think>\\nHere, provide a concise summary that reflects your reasoning and presents a clear final answer to the user.\" %}\n{# Removed external system-prompt exception: ignore any custom system prompts #}\n{%- if messages[0][\"role\"] == \"system\" %}  \n    {%- set system_message = messages[0][\"content\"] + system_message_two + \"\\n\" %}  \n    {%- set loop_messages = messages[1:] %}  \n{%- else %}  \n    {%- set loop_messages = messages %}  \n{%- endif %}  \n{%- if not tools is defined %}  \n    {%- set tools = none %}  \n{%- endif %}\n{%- set user_messages = loop_messages | selectattr(\"role\", \"equalto\", \"user\") | list %}  \n{#- This block checks for alternating user/assistant messages, skipping tool calling messages #}  \n{%- set ns = namespace() %}  \n{%- set ns.index = 0 %}  \n{%- for message in loop_messages %}  \n    {%- if not (message.role == \"tool\" or message.role == \"tool_results\" or (message.tool_calls is defined and message.tool_calls is not none)) %}  \n        {%- if (message[\"role\"] == \"user\") != (ns.index % 2 == 0) %}  \n            {{- raise_exception(\"After the optional system message, conversation roles must alternate user/assistant/user/assistant...\") }}  \n        {%- endif %}  \n        {%- set ns.index = ns.index + 1 %}  \n    {%- endif %}  \n{%- endfor %}\n{{- bos_token }}  \n{%- for message in loop_messages %}  \n    {%- if message[\"role\"] == \"user\" %}  \n        {%- if tools is not none and (message == user_messages[-1]) %}  \n            {{- \"[AVAILABLE_TOOLS] [\" }}  \n            {%- for tool in tools %}  \n                {%- set tool = tool.function %}  \n                {{- '{\"type\": \"function\", \"function\": {' }}  \n                {%- for key, val in tool.items() if key != \"return\" %}  \n                    {%- if val is string %}  \n                        {{- '\"' + key + '\": \"' + val + '\"' }}  \n        
            {%- else %}  \n                        {{- '\"' + key + '\": ' + val|tojson }}  \n                    {%- endif %}  \n                    {%- if not loop.last %}  \n                        {{- \", \" }}  \n                    {%- endif %}  \n                {%- endfor %}  \n                {{- \"}}\" }}  \n                {%- if not loop.last %}  \n                    {{- \", \" }}  \n                {%- else %}  \n                    {{- \"]\" }}  \n                {%- endif %}  \n            {%- endfor %}  \n            {{- \"[/AVAILABLE_TOOLS]\" }}  \n        {%- endif %}  \n        {%- if  system_message is defined %}  \n            {{- \"[INST]\" + system_message + \"\\n\" + message[\"content\"] + \"[/INST]\\n\" }}  \n        {%- else %}  \n            {{- \"[INST]\" + message[\"content\"] + \"[/INST]\\n\" }}  \n        {%- endif %}  \n    {%- elif message.tool_calls is defined and message.tool_calls is not none %}  \n        {{- \"[TOOL_CALLS] [\" }}  \n        {%- for tool_call in message.tool_calls %}  \n            {%- set out = tool_call.function|tojson %}  \n            {{- out[:-1] }}  \n            {%- if not tool_call.id is defined or tool_call.id|length != 9 %}  \n                {{- raise_exception(\"Tool call IDs should be alphanumeric strings with length 9!\") }}  \n            {%- endif %}  \n            {{- ', \"id\": \"' + tool_call.id + '\"}' }}  \n            {%- if not loop.last %}  \n                {{- \", \" }}  \n            {%- else %}  \n                {{- \"]\" + eos_token }}  \n            {%- endif %}  \n        {%- endfor %}  \n    {%- elif message[\"role\"] == \"assistant\" %}  \n        {{- \"\" + message[\"content\"]|trim + eos_token + \"\\n\"}}  \n    {%- elif message[\"role\"] == \"tool_results\" or message[\"role\"] == \"tool\" %}  \n        {%- if message.content is defined and message.content.content is defined %}  \n            {%- set content = message.content.content %}  \n        {%- else %}  \n            
{%- set content = message.content %}  \n        {%- endif %}  \n        {{- '[TOOL_RESULTS] {\"content\": ' + content|string + \", 'call_id': '\" + message.tool_call_id + \"'}[/TOOL_RESULTS]\" }}  \n    {%- else %}  \n        {{- raise_exception(\"Only user and assistant roles are supported, with the exception of an initial optional system message!\") }}  \n    {%- endif %}  \n{%- endfor %}",
   "clean_up_tokenization_spaces": false,
   "eos_token": "</s>",
   "extra_special_tokens": {},
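Comparing the two versions of `chat_template`, the only difference appears to be that the trailing `User:\n` label is dropped, both from the default `system_message` and from the branch that appends `system_message_two` to a custom system prompt. The sketch below imitates the template's user branch in plain Python to show the effect on the rendered prompt. It is a simplified stand-in, not the Jinja template itself: `render_single_turn` is a hypothetical name, `DEFAULT_PROMPT` abbreviates the long default system prompt, and tools and multi-turn history are ignored.

```python
# Shortened stand-in for the long default system prompt in the template.
DEFAULT_PROMPT = "A user will ask you to solve a task. [...]"


def render_single_turn(content: str, with_user_label: bool, bos_token: str = "<s>") -> str:
    """Mimic the template's user branch for one user message and no tools."""
    # Old template: system_message ended in "\nUser:\n". New template: "\n".
    suffix = "\nUser:\n" if with_user_label else "\n"
    system_message = DEFAULT_PROMPT + suffix
    # Same concatenation as the template's "[INST]" expression.
    return bos_token + "[INST]" + system_message + "\n" + content + "[/INST]\n"


if __name__ == "__main__":
    print(repr(render_single_turn("Who are you?", with_user_label=True)))   # before this PR
    print(repr(render_single_turn("Who are you?", with_user_label=False)))  # after this PR
```

Under these assumptions the new prompt is the old one with a single `User:\n` removed, so the user's text follows the system prompt after one blank line.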