---
base_model: meta-llama/Llama-3.2-3B-Instruct
license: llama3.2
language:
- en
library_name: transformers
tags:
- llama
- gguf
- mud
- game-ai
- decision-making
- fine-tuned
- unsloth
- trl
- sft
model_name: mud-judgment
pipeline_tag: text-generation
quantized_by: llama.cpp
---
# mud-judgment — MUD Game Decision Engine (GGUF)
A fine-tuned Llama 3.2 3B Instruct model that makes real-time judgment calls for a bot playing [Apocalypse VI: Reborn](http://apocalypse-vi.com), a CircleMUD text game. The model handles decisions that scripted logic cannot: flee or fight, which path to take, whether to enter a dangerous area.
## Model Details
| Property | Value |
|----------|-------|
| **Base model** | `meta-llama/Llama-3.2-3B-Instruct` |
| **Fine-tuning method** | QLoRA via Unsloth (rank=16, alpha=32) |
| **Training framework** | TRL SFTTrainer, completion-only loss |
| **Training data** | ~594 hand-crafted JSONL examples across 4 decision categories |
| **Quantization** | Q4_K_M (1.9 GB) and Q8_0 (3.2 GB) via llama.cpp |
| **VRAM requirement** | ~3 GB (Q4_K_M), ~4.5 GB (Q8_0) |
| **Output format** | Single command + one-line reasoning |
## Files
| File | Size | Description |
|------|------|-------------|
| `mud-judgment-q4km.gguf` | 1.9 GB | Q4_K_M quantization (recommended for ≤6 GB VRAM) |
| `mud-judgment-q8.gguf` | 3.2 GB | Q8_0 quantization (higher quality, needs ~5 GB VRAM) |
| `Modelfile` | — | Ollama Modelfile with Llama 3.2 chat template |
| `system_prompt.txt` | — | Required system prompt (must be included in every call) |
## Quick Start — Ollama
```bash
# Download the GGUF and Modelfile, then:
ollama create mud-judgment -f Modelfile
# Call via API (system prompt is required):
curl -s http://localhost:11434/api/chat -d '{
  "model": "mud-judgment",
  "stream": false,
  "messages": [
    {"role": "system", "content": "<contents of system_prompt.txt>"},
    {"role": "user", "content": "[SITUATION]\nDecision: COMBAT | Trigger: HP critical | State: 28hp 100mn 35mv | Level 7 | Buffs: none\n[/SITUATION]\n\nA forest wraith slashes YOU extremely hard.\nThat really did HURT!\nYour blood freezes as you hear a wraith'\''s death shriek."}
  ]
}'
```
Expected response:
```
flee
> HP critical at 28, wraith hitting extremely hard — cannot sustain this fight
```
## Quick Start — llama.cpp / Python
```bash
# llama.cpp CLI (-e expands the \n escapes in the prompt string)
llama-cli -m mud-judgment-q4km.gguf --temp 0.3 --top-p 0.9 -e \
  -p "<|start_header_id|>system<|end_header_id|>\n\n<system prompt><|eot_id|><|start_header_id|>user<|end_header_id|>\n\n<situation><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
```
```python
# Python with llama-cpp-python
from llama_cpp import Llama
llm = Llama(model_path="mud-judgment-q4km.gguf", n_ctx=2048, n_gpu_layers=-1)
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": open("system_prompt.txt").read()},
        {"role": "user", "content": situation_text},
    ],
    temperature=0.3,
    top_p=0.9,
)
print(response["choices"][0]["message"]["content"])
```
## Decision Types
The model handles four categories of judgment calls:
| Type | When Called | Example Commands |
|------|------------|-----------------|
| **COMBAT** | HP critical, losing fight, buffs expired | `flee`, `recall`, `rebuff` |
| **NAVIGATION** | Stuck, maze, forced movement, no exits | `north`, `extract`, `maze`, `forced` |
| **RISK** | Unexplored exit, dangerous mob, death room | `continue`, `avoid`, `unavailable`, `hostile` |
| **RECOVERY** | Post-death, stuck, resource depletion | `urgent`, `rebuff`, `abandon`, `extract` |
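In a bot loop, these categories map naturally onto a trigger-to-type table. A minimal sketch of such routing (the trigger names and the `decision_type_for` helper are illustrative, not part of this model's API):

```python
# Illustrative routing table: which decision category to request for a given
# trigger. Trigger names here are hypothetical examples, not defined by the model.
DECISION_TYPES = {
    "hp_critical": "COMBAT",
    "buffs_expired": "COMBAT",
    "no_exits": "NAVIGATION",
    "forced_movement": "NAVIGATION",
    "unexplored_exit": "RISK",
    "dangerous_mob": "RISK",
    "post_death": "RECOVERY",
    "resources_depleted": "RECOVERY",
}

def decision_type_for(trigger: str) -> str:
    """Return the decision category for a trigger, defaulting to RISK."""
    return DECISION_TYPES.get(trigger, "RISK")
```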
## Input Format
Every user message must contain a `[SITUATION]` block:
```
[SITUATION]
Decision: RISK | Trigger: Unexplored exit | State: 94hp 177mn 68mv | Level 5 | Buffs: invis, sanc
[/SITUATION]
Standing at the edge of a deep crevasse...
One false step and you'd plunge into the darkness below.
There appears to be no chance of surviving the deadly fall.
[EXITS: North East *Down*]
```
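The block above can be assembled programmatically. A sketch of a builder function, assuming the field layout shown in the example (the function name and signature are illustrative):

```python
def build_situation(decision: str, trigger: str, hp: int, mana: int, moves: int,
                    level: int, buffs: list[str], room_text: str) -> str:
    """Assemble the [SITUATION] block the model expects as its user message.

    Field names and ordering follow the README's example; the helper itself
    is an illustration, not part of the model's API.
    """
    state = f"{hp}hp {mana}mn {moves}mv"
    buff_str = ", ".join(buffs) if buffs else "none"
    header = (f"Decision: {decision} | Trigger: {trigger} | "
              f"State: {state} | Level {level} | Buffs: {buff_str}")
    return f"[SITUATION]\n{header}\n[/SITUATION]\n\n{room_text}"
```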
## Output Format
Exactly two lines:
1. A single command (game command or script command)
2. A reasoning line prefixed with `>`
```
avoid
> Death room — crevasse with "no chance of surviving" language, flagging for safe exploration later
```
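Because the format is fixed, the reply is easy to parse and validate before handing the command to the game client. A minimal sketch (the `parse_decision` helper is illustrative):

```python
def parse_decision(output: str) -> tuple[str, str]:
    """Split the model's two-line reply into (command, reasoning).

    Raises ValueError if the reply does not match the expected format,
    so the caller can fall back to a scripted default.
    """
    lines = [ln.strip() for ln in output.strip().splitlines() if ln.strip()]
    if len(lines) < 2 or not lines[1].startswith(">"):
        raise ValueError(f"unexpected model output: {output!r}")
    command = lines[0]
    reasoning = lines[1].lstrip("> ").strip()
    return command, reasoning
```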
## Important Usage Notes
- **System prompt is mandatory.** The model was trained with the system prompt in every example. Without it, output quality degrades significantly.
- **Temperature 0.3** is recommended. Higher temperatures produce inconsistent formatting.
- **Do not use `ollama run` without setting the system prompt first** (`/set system <prompt>`). Use the chat API instead.
- **Modelfile must include the full Llama 3.2 chat template** — see the included `Modelfile` for the correct template.
## Training Details
- **Method:** QLoRA with Unsloth on WSL2 Ubuntu 24.04
- **GPU:** NVIDIA RTX 1000 Ada (6 GB VRAM) — training fits in ~4 GB
- **Epochs:** 2 (with 594 examples)
- **Learning rate:** 5e-5 with cosine scheduler
- **Effective batch size:** 8 (batch=1, grad_accum=8)
- **Eval loss:** 1.86 (steadily declining, no overfitting)
- **Loss type:** Completion-only (only trains on assistant response tokens)
- **LoRA targets:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
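The hyperparameters above can be collected into a single config for reproduction. Key names below loosely follow TRL/PEFT conventions and are illustrative; the actual training script may differ:

```python
# Illustrative consolidation of the hyperparameters listed above.
# Key names loosely follow TRL/PEFT conventions, not the exact script.
train_config = {
    "base_model": "meta-llama/Llama-3.2-3B-Instruct",
    "lora_r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
    "num_train_epochs": 2,
    "learning_rate": 5e-5,
    "lr_scheduler_type": "cosine",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
}

# Effective batch size = per-device batch x gradient accumulation steps.
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])
```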
## Limitations
- Trained specifically for Apocalypse VI: Reborn game mechanics. May not generalize to other MUDs without additional training data.
- The 594-example training set covers common scenarios well but edge cases (ITEM, UNEXPECTED types) have minimal coverage.
- Quantization to Q4_K_M introduces slight quality loss vs. the full-precision LoRA adapter.
## Source Code
Training scripts, data generation, and the crawler that consumes this model are at:
[github.com/ninjarob/Apocalypse-VI-Projects](https://github.com/ninjarob/Apocalypse-VI-Projects)
## Citation
```bibtex
@misc{mud-judgment-2026,
title={mud-judgment: Fine-tuned Llama 3.2 3B for MUD Game Decision Making},
author={Robert Kevan},
year={2026},
url={https://huggingface.co/rkevan/mud-judgment}
}
```