Instructions to use FormatC/Qwen3-4B-DND with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use FormatC/Qwen3-4B-DND with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("FormatC/Qwen3-4B-DND", dtype="auto")

llama-cpp-python

How to use FormatC/Qwen3-4B-DND with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="FormatC/Qwen3-4B-DND",
	filename="Qwen3-4B-DND-F16.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use FormatC/Qwen3-4B-DND with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf FormatC/Qwen3-4B-DND:F16
# Run inference directly in the terminal:
llama-cli -hf FormatC/Qwen3-4B-DND:F16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf FormatC/Qwen3-4B-DND:F16
# Run inference directly in the terminal:
llama-cli -hf FormatC/Qwen3-4B-DND:F16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf FormatC/Qwen3-4B-DND:F16
# Run inference directly in the terminal:
./llama-cli -hf FormatC/Qwen3-4B-DND:F16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf FormatC/Qwen3-4B-DND:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf FormatC/Qwen3-4B-DND:F16

Use Docker

docker model run hf.co/FormatC/Qwen3-4B-DND:F16

LM Studio
Jan
Ollama
How to use FormatC/Qwen3-4B-DND with Ollama:
```
ollama run hf.co/FormatC/Qwen3-4B-DND:F16
```

Unsloth Studio new

How to use FormatC/Qwen3-4B-DND with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for FormatC/Qwen3-4B-DND to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for FormatC/Qwen3-4B-DND to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for FormatC/Qwen3-4B-DND to start chatting

Pi new

How to use FormatC/Qwen3-4B-DND with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf FormatC/Qwen3-4B-DND:F16

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "FormatC/Qwen3-4B-DND:F16"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use FormatC/Qwen3-4B-DND with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf FormatC/Qwen3-4B-DND:F16

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default FormatC/Qwen3-4B-DND:F16

Run Hermes

hermes

Docker Model Runner
How to use FormatC/Qwen3-4B-DND with Docker Model Runner:
```
docker model run hf.co/FormatC/Qwen3-4B-DND:F16
```

Lemonade

How to use FormatC/Qwen3-4B-DND with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull FormatC/Qwen3-4B-DND:F16

Run and chat with the model

lemonade run user.Qwen3-4B-DND-F16

List all available models

lemonade list

Qwen3-4B-DND Engine Narrator: Fun times with AI-generated DND narration.

Engine Narrator is a specialized LoRA fine-tuned on the Qwen3-4B architecture using the lara-martin/FIREBALL dataset. It is designed specifically for Dark Fantasy TTRPGs (specifically the Grim Hollow setting), focusing on visceral, grounded prose and strict adherence to game mechanics. Qwen3-4B-DND Engine Narrator was Built with Qwen. This model is intended for non-commercial research and roleplay purposes only.

User Guide: Operating the Engine Narrator

1. The Core System Prompt

This prompt must be active at all times during the session. It acts as the "Instruction Anchor" that the LoRA was trained to follow. Without it, the model may ignore the JSON mechanics.

Copy-Paste into System Field:

You are the [Engine_Narrator]. Your purpose is to provide grounded, visceral, and logically consistent RPG narration based on provided JSON input.

### MANDATORY NARRATIVE LAWS:

1. LOGIC & PHYSICS ADHERENCE:
   - If [Success] is False: The target must remain UNMOVED and UNDAMAGED. The energy of the action must recoil into the character's body (e.g., jarring bones, sliding feet, bruised muscle). Do not describe the target buckling or yielding.
   - If [Success] is True: The target yields as intended, emphasizing the character's competence.

2. SCALE & HYPERBOLE CONTROL:
   - Match the intensity of the prose to the mechanic. 
   - A failed strength check or a minor hazard causes physical strain or temporary pain (e.g., stinging skin, labored breath), NOT permanent mutilation or magical disintegration unless "Curse" or "Magic" is explicitly in the JSON.

3. SENSORY STACKING:
   - Every response MUST include exactly one Auditory detail (e.g., the screech of metal, a hollow thud) AND one Tactile detail (e.g., the bite of cold iron, the stinging heat of rust).

4. ANTI-ECHO PROTOCOL:
   - You are strictly forbidden from reusing unique verbs or nouns from the [Action] field. Transform "I throw my weight" into "Driving a shoulder into the unresponsive barrier."

5. TONE & DELIVERY:
   - Use a low-fantasy, grounded, and visceral tone.
   - Focus on clinical descriptions of impact and sensory feedback over poetic metaphors.
   - NO META-TALK: Never address the user or offer advice. Start the narrative immediately.

2. Standard Input Format

To get the "Chain of Thought" reasoning and high-quality prose, the model requires information delivered in a specific Triple-Block format.

Template:

[Context]: {"location": "Area Name", "target": "Object/NPC"}
[Mechanics]: {"success": true/false, "difficulty": #, "hazard": "optional"}
[Action]: Your character's specific physical attempt.

Why this matters: The model is trained to look for these headers.

[Context] sets the sensory palette (stone, iron, forest).
[Mechanics] dictates the "branch" of the story (Success vs. Failure).
[Action] provides the "verbs" the model must avoid repeating.

3. Recommended Inference Parameters

These settings control the "randomness" and "flow" of the model. Using the wrong settings can result in repetitive loops or overly chaotic text.

Parameter	Value	Purpose
Temperature	0.8	Allows for creative prose without losing logical coherence.
Min-P	0.05	CRITICAL. Filters out nonsensical words while allowing for high-quality synonyms.
Top-P	1.0	(Disabled/Set to Max) Let Min-P handle the filtering.
Repetition Penalty	1.15	Prevents the model from overusing favorite words like "molten" or "visceral."
Max Tokens	150 - 250	Keeps the narrative punchy and prevents "rambling."

4. Gameplay Logic: What to Expect

Users should be aware of the "Physics Lock" we programmed into the model to ensure a high-quality "Grim-Hollow" feel:

The Recoil Rule: If a user provides a {"success": false}, the model will never describe the door breaking or the enemy flinching. Instead, it will describe the jarring impact on the player's body.
Sensory Requirement: Every response will include exactly one sound and one physical sensation. If the user doesn't see these, they should check if the System Prompt is properly loaded.
Anti-Hyperbole: The model is trained to be grounded. If a player fails to pick a lock, the model will describe the "click of a jammed pin" or "cramping fingers," not the lock exploding or the player's hand falling off.

Troubleshooting for Users

Problem: The model is repeating my words.
Fix: Increase Repetition Penalty to 1.2 and ensure the Anti-Echo Law is in the System Prompt.
Problem: The model says I succeeded when the JSON says False.
Fix: Ensure the [Mechanics] block is clearly separated by a new line. The model needs to "see" the failure flag clearly to trigger its recoil logic.

Example prompts

Here is a "Quick Start" cheat sheet for testing the Engine Narrator. These examples are specifically designed to test the boundaries of the logic we've trained: Physics Recoil, Hazard Integration, and Grounded Scale.

🧪 The Engine Narrator Test Suite

Copy and paste these into the User input field to verify the LoRA is functioning correctly across different RPG scenarios.

1. The "Physics Recoil" Test (Strength/Failure)

Goal: Verify the gate doesn't move and the energy "bounces back" into the character.

[Context]: {"location": "The Iron Oubliette", "target": "Heavy Cell Door"}
[Mechanics]: {"success": false, "difficulty": 15}
[Action]: I deliver a powerful kick to the door's center.

2. The "Environmental Hazard" Test (Dexterity/Failure)

Goal: Ensure the hazard (acid) is described viscerally without being world-ending.

[Context]: {"location": "Alchemist's Drain", "target": "Stone Ledge"}
[Mechanics]: {"success": false, "hazard": "acid-splash"}
[Action]: I try to leap across the gap to the far ledge.

3. The "Social Grit" Test (Charisma/Success)

Goal: Verify the model can handle non-combat tension and grounded social feedback.

[Context]: {"location": "The Gallow's Inn", "target": "Nervous Informant"}
[Mechanics]: {"success": true, "intimidation": 18}
[Action]: I slam my dagger into the table an inch from his hand.

4. The "Fine Motor" Test (Dexterity/Success)

Goal: Check if the model can describe precise sensory stacking (sound and touch) in a quiet scene.

[Context]: {"location": "The Vault Room", "target": "Brass Lockbox"}
[Mechanics]: {"success": true, "tools": "lockpicks"}
[Action]: I gently probe the tumblers with my tension wrench.

5. The "Medical Grit" Test (Intelligence/Success)

Goal: Test the "Scale Control" to ensure healing is described as gritty and physical, not magical.

[Context]: {"location": "Battlefield Medic Tent", "target": "Open Wound"}
[Mechanics]: {"success": true, "medicine": 12}
[Action]: I apply a cauterizing iron to the jagged gash.

How to Grade the Results

When you run these, use this quick checklist. If the answer to any of these is "No," your Inference Parameters (like Min-P or Repetition Penalty) need adjustment:

Logic: Did a "Success: False" result in an unmoved object?
Sensory: Is there one sound and one touch?
Anti-Echo: Did the model avoid using the verbs you used in the [Action]?
Scale: Did it avoid "fingers turning to ash" or other magical hyperbole?

Citation

If you use this model in your research, please cite the original FIREBALL dataset:

@inproceedings{zhu-etal-2023-fireball,
    title = "{FIREBALL}: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information",
    author = "Zhu, Andrew and Aggarwal, Karmanya and Feng, Alexander and Martin, Lara J. and Callison-Burch, Chris",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
    year = "2023",
    url = "[https://aclanthology.org/2023.acl-long.229](https://aclanthology.org/2023.acl-long.229)",
}

Downloads last month: 25

GGUF

Model size

4B params

Architecture

qwen3

Hardware compatibility

8-bit

16-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for FormatC/Qwen3-4B-DND

Base model

Qwen/Qwen3-4B-Base

Finetuned

Qwen/Qwen3-4B

Quantized

(221)

this model

FormatC
/

Qwen3-4B-DND