Instructions for using limloop/MN-12B-Hydra-RP-RU with libraries, inference providers, and local apps.
- Libraries
- Transformers
How to use limloop/MN-12B-Hydra-RP-RU with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="limloop/MN-12B-Hydra-RP-RU")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("limloop/MN-12B-Hydra-RP-RU")
model = AutoModelForCausalLM.from_pretrained("limloop/MN-12B-Hydra-RP-RU")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
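For interactive roleplay it can help to stream tokens as they are generated. A minimal sketch using transformers' built-in TextStreamer, reusing the `tokenizer`, `model`, and `inputs` from the block above (the generation length is illustrative):

```python
from transformers import TextStreamer

# Prints tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=200, streamer=streamer)
```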
- Local Apps
- vLLM
How to use limloop/MN-12B-Hydra-RP-RU with vLLM:
Install from pip and serve the model:
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "limloop/MN-12B-Hydra-RP-RU"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "limloop/MN-12B-Hydra-RP-RU",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
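Because vLLM exposes an OpenAI-compatible API, the same endpoint can also be called from Python with the openai client. A minimal sketch; the base URL matches the default vLLM port above, and the `api_key` value is a placeholder (vLLM ignores it unless you configure one):

```python
from openai import OpenAI

# Point the client at the local vLLM server started above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="limloop/MN-12B-Hydra-RP-RU",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```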
- SGLang
How to use limloop/MN-12B-Hydra-RP-RU with SGLang:
Install from pip and serve the model:

```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "limloop/MN-12B-Hydra-RP-RU" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "limloop/MN-12B-Hydra-RP-RU",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images:

```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "limloop/MN-12B-Hydra-RP-RU" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "limloop/MN-12B-Hydra-RP-RU",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use limloop/MN-12B-Hydra-RP-RU with Docker Model Runner:
```sh
docker model run hf.co/limloop/MN-12B-Hydra-RP-RU
```
MN-12B-Hydra-RP-RU
🌟 About the model
MN-12B-Hydra-RP-RU is an experimental merge built on Mistral Nemo 12B that combines:
- 🎭 Strong roleplay abilities
- 📚 Rich literary Russian
- 🔓 Removed censorship
The model was assembled via TIES merging, which combines the weights of several models with minimal conflicts between parameters.
🎯 Features
- Primary language: Russian
- Holds characters and context well
- Follows instructions
- Retains the capabilities of the base Nemo
- Received no additional training after the merge
⚠️ Important
The model's uncensored nature means it may generate content that some users will find inappropriate.
High-quality TIES merge based on Mistral Nemo 12B, optimized for roleplay, strong Russian language capabilities, and uncensored behavior.
🌍 Overview
MN-12B-Hydra-RP-RU is an experimental merge built on top of Mistral Nemo 12B, combining strengths from multiple fine-tuned models:
- 🎭 Advanced roleplay capability from Pathfinder-RP
- 📚 Deep Russian language fluency inspired by Vikhr + Dostoevsky-style tuning
- 🔓 Reduced safety filtering via uncensored components
The merge was created using TIES merging, which allows combining model deltas while minimizing destructive interference between weights.
🎯 Key Features
| Feature | Description |
|---|---|
| Languages | Russian, English |
| Censorship | Uncensored behavior |
| Roleplay | Strong character consistency and narrative depth |
| Instruction Following | Reliable prompt adherence |
| Tool Calling | Retains base Nemo capabilities (see the sketch below) |
| Architecture | Mistral Nemo 12B |
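Since tool calling is listed as retained from base Nemo, here is a minimal sketch of passing tools through the transformers chat template, reusing `tokenizer` and `model` from the usage example below. The `get_weather` function and its docstring schema are illustrative assumptions, not part of this card:

```python
def get_weather(city: str) -> str:
    """Get the current weather in a city.

    Args:
        city: Name of the city.
    """
    return "sunny"

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
# Recent transformers versions build a JSON schema from the function's
# signature and docstring and inject it via the model's chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
```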
🧩 Model Composition
The merge combines the following models:
| Model | Role in merge | Weight |
|---|---|---|
| Pathfinder-RP-12B-RU | Base model, RP backbone | 0.60 |
| Vikhr Nemo ORPO Dostoevsky | Literary Russian depth | 0.25 |
| HERETIC Uncensored | Safety removal | 0.30 |
| Mag-Mell R1 Uncensored | Additional uncensor delta | 0.20 |
Weights shown before normalization (final weights are normalized to sum = 1).
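Concretely, normalization divides each listed weight by their total (1.35 here), assuming simple division by the sum as in mergekit's `normalize` option:

```python
# Pre-normalization weights from the composition table above.
weights = {
    "Pathfinder-RP-12B-RU": 0.60,
    "Vikhr Nemo ORPO Dostoevsky": 0.25,
    "HERETIC Uncensored": 0.30,
    "Mag-Mell R1 Uncensored": 0.20,
}
total = sum(weights.values())  # 1.35
for name, w in weights.items():
    print(f"{name}: {w / total:.3f}")
# Pathfinder ~0.444, Vikhr ~0.185, HERETIC ~0.222, Mag-Mell ~0.148
```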
💡 Usage Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "limloop/MN-12B-Hydra-RP-RU"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "You are a medieval innkeeper. Greet the traveler!"
messages = [{"role": "user", "content": prompt}]

# add_generation_prompt appends the assistant turn header so the model
# replies instead of continuing the user message.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# temperature only takes effect with sampling enabled.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
⚙️ Merge Details
Built using mergekit with the TIES method (Trim, Elect Sign, Merge).
Core mechanism (see the sketch after this list):
- Trim low-magnitude deltas via `density`
- Resolve sign conflicts (elect the dominant sign per parameter)
- Weighted averaging of aligned parameters
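To make the three steps concrete, here is a toy single-tensor sketch of TIES. This is a simplified illustration of the method, not mergekit's actual implementation; the function name and the `density` default are assumptions:

```python
import torch

def ties_merge(deltas, weights, density=0.9):
    """Toy TIES on one tensor: trim, elect sign, merge.

    deltas  -- list of task vectors (fine-tuned weights minus base weights)
    weights -- per-model merge weights, assumed normalized to sum to 1
    """
    trimmed = []
    for d in deltas:
        # 1. Trim: zero out all but the top-`density` fraction by magnitude.
        k = max(1, int(density * d.numel()))
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))

    w = torch.tensor(weights).view(-1, *([1] * trimmed[0].dim()))
    stacked = torch.stack(trimmed)  # (num_models, *tensor_shape)

    # 2. Elect sign: the sign of the weighted sum wins per parameter.
    elected = (w * stacked).sum(dim=0).sign()

    # 3. Merge: weighted-average only entries whose sign matches the elected one.
    agree = (stacked.sign() == elected) & (stacked != 0)
    num = (w * stacked * agree).sum(dim=0)
    den = (w * agree).sum(dim=0).clamp(min=1e-9)
    return num / den

# merged_weight = base_weight + ties_merge(task_vectors, normalized_weights)
```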
Merge Configuration

```yaml
models:
  - model: Aleteian/Pathfinder-RP-12B-RU
    weight: 0.6
  - model: IlyaGusev/vikhr_nemo_orpo_dostoevsky_12b_slerp
    weight: 0.25
    density: 0.9
  - model: DavidAU/Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC
    weight: 0.3
    density: 0.9
  - model: Naphula/MN-12B-Mag-Mell-R1-Uncensored
    weight: 0.2
    density: 0.9
merge_method: ties
parameters:
  epsilon: 0.01
  normalize: true
base_model: Aleteian/Pathfinder-RP-12B-RU
dtype: bfloat16
tokenizer:
  source: base
```
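A config like this is typically saved to a YAML file and passed to mergekit's command-line entry point, e.g. `mergekit-yaml hydra-ties.yml ./MN-12B-Hydra-RP-RU` after `pip install mergekit` (the file and output names here are illustrative).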
⚠️ Known Characteristics
- No additional post-merge fine-tuning
- May switch to English on complex reasoning tasks (see the note after this list)
- Uncensored components allow generation of explicit or controversial content
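Given the tendency to switch to English, one common mitigation is pinning the language with a system message. This is a generic prompting sketch, not something verified for this particular merge:

```python
messages = [
    # System prompt in Russian: "Always answer in Russian."
    {"role": "system", "content": "Всегда отвечай на русском языке."},
    {"role": "user", "content": prompt},  # prompt from the usage example above
]
```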