Instructions for using rkevan/mud-judgment with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use rkevan/mud-judgment with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="rkevan/mud-judgment")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("rkevan/mud-judgment", dtype="auto")

- llama-cpp-python
How to use rkevan/mud-judgment with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="rkevan/mud-judgment",
    filename="mud-judgment-q4km.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use rkevan/mud-judgment with llama.cpp:
Install from brew

brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf rkevan/mud-judgment

# Run inference directly in the terminal:
llama-cli -hf rkevan/mud-judgment
Install from WinGet (Windows)

winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf rkevan/mud-judgment

# Run inference directly in the terminal:
llama-cli -hf rkevan/mud-judgment
Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf rkevan/mud-judgment

# Run inference directly in the terminal:
./llama-cli -hf rkevan/mud-judgment
Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf rkevan/mud-judgment

# Run inference directly in the terminal:
./build/bin/llama-cli -hf rkevan/mud-judgment
Use Docker
docker model run hf.co/rkevan/mud-judgment
- LM Studio
- Jan
- vLLM
How to use rkevan/mud-judgment with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "rkevan/mud-judgment"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "rkevan/mud-judgment",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Use Docker
docker model run hf.co/rkevan/mud-judgment
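However the server is started, it speaks the OpenAI API, so the standard openai Python client can talk to it. A minimal sketch, assuming the pip-served endpoint at http://localhost:8000 (vLLM ignores the API key unless one has been configured):

from openai import OpenAI

# Point the standard OpenAI client at the local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="rkevan/mud-judgment",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)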
- SGLang
How to use rkevan/mud-judgment with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "rkevan/mud-judgment" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "rkevan/mud-judgment",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "rkevan/mud-judgment" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "rkevan/mud-judgment",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

- Ollama
How to use rkevan/mud-judgment with Ollama:
ollama run hf.co/rkevan/mud-judgment
- Unsloth Studio
How to use rkevan/mud-judgment with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for rkevan/mud-judgment to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for rkevan/mud-judgment to start chatting
Use Hugging Face Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for rkevan/mud-judgment to start chatting
- Pi
How to use rkevan/mud-judgment with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf rkevan/mud-judgment
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent

# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {"id": "rkevan/mud-judgment"}
      ]
    }
  }
}

Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use rkevan/mud-judgment with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf rkevan/mud-judgment
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default rkevan/mud-judgment
Run Hermes
hermes
- Docker Model Runner
How to use rkevan/mud-judgment with Docker Model Runner:
docker model run hf.co/rkevan/mud-judgment
- Lemonade
How to use rkevan/mud-judgment with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull rkevan/mud-judgment
Run and chat with the model
lemonade run user.mud-judgment-{{QUANT_TAG}}

List all available models
lemonade list
mud-judgment – MUD Game Decision Engine (GGUF)
A fine-tuned Llama 3.2 3B Instruct model that makes real-time judgment calls for a bot playing Apocalypse VI: Reborn, a CircleMUD text game. The model handles decisions that scripted logic cannot: flee or fight, which path to take, whether to enter a dangerous area.
Model Details
| Property | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Fine-tuning method | QLoRA via Unsloth (rank=16, alpha=32) |
| Training framework | TRL SFTTrainer, completion-only loss |
| Training data | ~594 hand-crafted JSONL examples across 4 decision categories |
| Quantization | Q4_K_M (1.9 GB) and Q8_0 (3.2 GB) via llama.cpp |
| VRAM requirement | ~3 GB (Q4_K_M), ~4.5 GB (Q8_0) |
| Output format | Single command + one-line reasoning |
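For context, one record in that JSONL set plausibly has the shape below. This is a hypothetical reconstruction from the Input Format and Output Format sections later in this card, not a verbatim sample from the training data.

import json

# Hypothetical shape of a single training record; the field names and
# content are assumptions based on the documented input/output formats.
record = {
    "messages": [
        {"role": "system", "content": "<contents of system_prompt.txt>"},
        {"role": "user", "content": ("[SITUATION]\nDecision: COMBAT | Trigger: HP critical | "
                                     "State: 28hp 100mn 35mv | Level 7 | Buffs: none\n[/SITUATION]\n\n"
                                     "A forest wraith slashes YOU extremely hard.")},
        {"role": "assistant", "content": "flee\n> HP critical at 28, cannot sustain this fight"},
    ]
}
print(json.dumps(record))  # one such line per example in the JSONL file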
Files
| File | Size | Description |
|---|---|---|
| mud-judgment-q4km.gguf | 1.9 GB | Q4_K_M quantization (recommended for ≤6 GB VRAM) |
| mud-judgment-q8.gguf | 3.2 GB | Q8_0 quantization (higher quality, needs ~5 GB VRAM) |
| Modelfile | – | Ollama Modelfile with Llama 3.2 chat template |
| system_prompt.txt | – | Required system prompt (must be included in every call) |
Quick Start – Ollama
# Download the GGUF and Modelfile, then:
ollama create mud-judgment -f Modelfile
# Call via API (system prompt is required):
curl -s http://localhost:11434/api/chat -d '{
"model": "mud-judgment",
"stream": false,
"messages": [
{"role": "system", "content": "<contents of system_prompt.txt>"},
{"role": "user", "content": "[SITUATION]\nDecision: COMBAT | Trigger: HP critical | State: 28hp 100mn 35mv | Level 7 | Buffs: none\n[/SITUATION]\n\nA forest wraith slashes YOU extremely hard.\nThat really did HURT!\nYour blood freezes as you hear a wraith'\''s death shriek."}
]
}'
Expected response:
flee
> HP critical at 28, wraith hitting extremely hard – cannot sustain this fight
Quick Start – llama.cpp / Python
# llama.cpp CLI
llama-cli -m mud-judgment-q4km.gguf --temp 0.3 --top-p 0.9 \
-p "<|start_header_id|>system<|end_header_id|>\n\n<system prompt><|eot_id|><|start_header_id|>user<|end_header_id|>\n\n<situation><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
# Python with llama-cpp-python
from llama_cpp import Llama
llm = Llama(model_path="mud-judgment-q4km.gguf", n_ctx=2048, n_gpu_layers=-1)
response = llm.create_chat_completion(
messages=[
{"role": "system", "content": open("system_prompt.txt").read()},
{"role": "user", "content": situation_text},
],
temperature=0.3,
top_p=0.9,
)
print(response["choices"][0]["message"]["content"])
Decision Types
The model handles four categories of judgment calls:
| Type | When Called | Example Commands |
|---|---|---|
| COMBAT | HP critical, losing fight, buffs expired | flee, recall, rebuff |
| NAVIGATION | Stuck, maze, forced movement, no exits | north, extract, maze, forced |
| RISK | Unexplored exit, dangerous mob, death room | continue, avoid, unavailable, hostile |
| RECOVERY | Post-death, stuck, resource depletion | urgent, rebuff, abandon, extract |
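Because the model replies with a bare command, its output can be sanity-checked against the expected vocabulary. A minimal sketch using only the example commands from the table above (the real command set may be larger):

# Example commands per decision category, taken from the table above.
EXAMPLE_COMMANDS = {
    "COMBAT": {"flee", "recall", "rebuff"},
    "NAVIGATION": {"north", "extract", "maze", "forced"},
    "RISK": {"continue", "avoid", "unavailable", "hostile"},
    "RECOVERY": {"urgent", "rebuff", "abandon", "extract"},
}

def is_known_command(decision_type: str, command: str) -> bool:
    """Check a model-issued command against the examples for its category."""
    return command in EXAMPLE_COMMANDS.get(decision_type, set())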
Input Format
Every user message must contain a [SITUATION] block:
[SITUATION]
Decision: RISK | Trigger: Unexplored exit | State: 94hp 177mn 68mv | Level 5 | Buffs: invis, sanc
[/SITUATION]
Standing at the edge of a deep crevasse...
One false step and you'd plunge into the darkness below.
There appears to be no chance of surviving the deadly fall.
[EXITS: North East *Down*]
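A small helper for assembling that block might look like the following sketch; the function name and signature are illustrative, and the field order follows the example above.

def build_situation(decision: str, trigger: str, hp: int, mn: int, mv: int,
                    level: int, buffs: list[str], observation: str) -> str:
    """Format a [SITUATION] header followed by the raw game text."""
    buff_str = ", ".join(buffs) if buffs else "none"
    header = (f"Decision: {decision} | Trigger: {trigger} | "
              f"State: {hp}hp {mn}mn {mv}mv | Level {level} | Buffs: {buff_str}")
    return f"[SITUATION]\n{header}\n[/SITUATION]\n\n{observation}"

Calling build_situation("RISK", "Unexplored exit", 94, 177, 68, 5, ["invis", "sanc"], "Standing at the edge of a deep crevasse...") reproduces the block above.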
Output Format
Exactly two lines:
- A single command (game command or script command)
- A reasoning line prefixed with >

For example:

avoid
> Death room – crevasse with "no chance of surviving" language, flagging for safe exploration later
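That shape is easy to parse mechanically; a sketch (the helper name is illustrative):

def parse_decision(output: str) -> tuple[str, str]:
    """Split model output into (command, reasoning); raise if malformed."""
    lines = [ln.strip() for ln in output.strip().splitlines() if ln.strip()]
    if len(lines) < 2 or not lines[1].startswith(">"):
        raise ValueError(f"unexpected model output: {output!r}")
    return lines[0], lines[1].lstrip("> ")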
Important Usage Notes
- System prompt is mandatory. The model was trained with the system prompt in every example. Without it, output quality degrades significantly.
- Temperature 0.3 is recommended. Higher temperatures produce inconsistent formatting.
- Do not use ollama run without setting the system prompt first (/set system <prompt>). Use the chat API instead.
- The Modelfile must include the full Llama 3.2 chat template; see the included Modelfile for the correct template.
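Putting those notes together, a minimal Python wrapper around Ollama's chat endpoint (the same /api/chat call shown with curl in the Quick Start) might look like this; the requests package, endpoint, and model name match the Quick Start assumptions:

import requests

SYSTEM_PROMPT = open("system_prompt.txt").read()  # mandatory in every call

def judge(situation_text: str) -> str:
    """Query mud-judgment through Ollama's /api/chat with the recommended settings."""
    resp = requests.post("http://localhost:11434/api/chat", json={
        "model": "mud-judgment",
        "stream": False,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": situation_text},
        ],
        "options": {"temperature": 0.3, "top_p": 0.9},
    })
    resp.raise_for_status()
    return resp.json()["message"]["content"]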
Training Details
- Method: QLoRA with Unsloth on WSL2 Ubuntu 24.04
- GPU: NVIDIA RTX 1000 Ada (6 GB VRAM); training fits in ~4 GB
- Epochs: 2 (with 594 examples)
- Learning rate: 5e-5 with cosine scheduler
- Effective batch size: 8 (batch=1, grad_accum=8)
- Eval loss: 1.86 (steadily declining, no overfitting)
- Loss type: Completion-only (only trains on assistant response tokens)
- LoRA targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
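As a rough sketch, that setup maps onto Unsloth + TRL roughly as follows. This is a reconstruction from the hyperparameters listed above, not the actual training script (which lives in the repo linked under Source Code); the dataset filename is a placeholder.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit for QLoRA.
model, tokenizer = FastLanguageModel.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct", load_in_4bit=True)

# Attach LoRA adapters with the rank/alpha/targets listed above.
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"])

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # placeholder path

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        num_train_epochs=2,
        learning_rate=5e-5,
        lr_scheduler_type="cosine",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,  # effective batch size 8
    ),
)
# Completion-only loss (training on assistant tokens only) is configured
# separately, e.g. via TRL's completion-only data collator; omitted here.
trainer.train()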
Limitations
- Trained specifically for Apocalypse VI: Reborn game mechanics. May not generalize to other MUDs without additional training data.
- The 594-example training set covers common scenarios well but edge cases (ITEM, UNEXPECTED types) have minimal coverage.
- Quantization to Q4_K_M introduces slight quality loss vs. the full-precision LoRA adapter.
Source Code
Training scripts, data generation, and the crawler that consumes this model are at: github.com/ninjarob/Apocalypse-VI-Projects
Citation
@misc{mud-judgment-2026,
title={mud-judgment: Fine-tuned Llama 3.2 3B for MUD Game Decision Making},
author={Robert Kevan},
year={2026},
url={https://huggingface.co/rkevan/mud-judgment}
}