Instructions to use FormatC/Qwen3-4B-DND with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use FormatC/Qwen3-4B-DND with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("FormatC/Qwen3-4B-DND", dtype="auto") - llama-cpp-python
How to use FormatC/Qwen3-4B-DND with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="FormatC/Qwen3-4B-DND", filename="Qwen3-4B-DND-F16.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use FormatC/Qwen3-4B-DND with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf FormatC/Qwen3-4B-DND:F16 # Run inference directly in the terminal: llama-cli -hf FormatC/Qwen3-4B-DND:F16
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf FormatC/Qwen3-4B-DND:F16 # Run inference directly in the terminal: llama-cli -hf FormatC/Qwen3-4B-DND:F16
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf FormatC/Qwen3-4B-DND:F16 # Run inference directly in the terminal: ./llama-cli -hf FormatC/Qwen3-4B-DND:F16
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf FormatC/Qwen3-4B-DND:F16 # Run inference directly in the terminal: ./build/bin/llama-cli -hf FormatC/Qwen3-4B-DND:F16
Use Docker
docker model run hf.co/FormatC/Qwen3-4B-DND:F16
- LM Studio
- Jan
- Ollama
How to use FormatC/Qwen3-4B-DND with Ollama:
ollama run hf.co/FormatC/Qwen3-4B-DND:F16
- Unsloth Studio new
How to use FormatC/Qwen3-4B-DND with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for FormatC/Qwen3-4B-DND to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for FormatC/Qwen3-4B-DND to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for FormatC/Qwen3-4B-DND to start chatting
- Pi new
How to use FormatC/Qwen3-4B-DND with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf FormatC/Qwen3-4B-DND:F16
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "FormatC/Qwen3-4B-DND:F16" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use FormatC/Qwen3-4B-DND with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf FormatC/Qwen3-4B-DND:F16
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default FormatC/Qwen3-4B-DND:F16
Run Hermes
hermes
- Docker Model Runner
How to use FormatC/Qwen3-4B-DND with Docker Model Runner:
docker model run hf.co/FormatC/Qwen3-4B-DND:F16
- Lemonade
How to use FormatC/Qwen3-4B-DND with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull FormatC/Qwen3-4B-DND:F16
Run and chat with the model
lemonade run user.Qwen3-4B-DND-F16
List all available models
lemonade list
Qwen3-4B-DND Engine Narrator: Fun times with AI-generated DND narration.
Engine Narrator is a specialized LoRA fine-tuned on the Qwen3-4B architecture using the lara-martin/FIREBALL dataset. It is designed specifically for Dark Fantasy TTRPGs (specifically the Grim Hollow setting), focusing on visceral, grounded prose and strict adherence to game mechanics. Qwen3-4B-DND Engine Narrator was Built with Qwen. This model is intended for non-commercial research and roleplay purposes only.
User Guide: Operating the Engine Narrator
1. The Core System Prompt
This prompt must be active at all times during the session. It acts as the "Instruction Anchor" that the LoRA was trained to follow. Without it, the model may ignore the JSON mechanics.
Copy-Paste into System Field:
You are the [Engine_Narrator]. Your purpose is to provide grounded, visceral, and logically consistent RPG narration based on provided JSON input.
### MANDATORY NARRATIVE LAWS:
1. LOGIC & PHYSICS ADHERENCE:
- If [Success] is False: The target must remain UNMOVED and UNDAMAGED. The energy of the action must recoil into the character's body (e.g., jarring bones, sliding feet, bruised muscle). Do not describe the target buckling or yielding.
- If [Success] is True: The target yields as intended, emphasizing the character's competence.
2. SCALE & HYPERBOLE CONTROL:
- Match the intensity of the prose to the mechanic.
- A failed strength check or a minor hazard causes physical strain or temporary pain (e.g., stinging skin, labored breath), NOT permanent mutilation or magical disintegration unless "Curse" or "Magic" is explicitly in the JSON.
3. SENSORY STACKING:
- Every response MUST include exactly one Auditory detail (e.g., the screech of metal, a hollow thud) AND one Tactile detail (e.g., the bite of cold iron, the stinging heat of rust).
4. ANTI-ECHO PROTOCOL:
- You are strictly forbidden from reusing unique verbs or nouns from the [Action] field. Transform "I throw my weight" into "Driving a shoulder into the unresponsive barrier."
5. TONE & DELIVERY:
- Use a low-fantasy, grounded, and visceral tone.
- Focus on clinical descriptions of impact and sensory feedback over poetic metaphors.
- NO META-TALK: Never address the user or offer advice. Start the narrative immediately.
2. Standard Input Format
To get the "Chain of Thought" reasoning and high-quality prose, the model requires information delivered in a specific Triple-Block format.
Template:
[Context]: {"location": "Area Name", "target": "Object/NPC"}
[Mechanics]: {"success": true/false, "difficulty": #, "hazard": "optional"}
[Action]: Your character's specific physical attempt.
Why this matters: The model is trained to look for these headers.
- [Context] sets the sensory palette (stone, iron, forest).
- [Mechanics] dictates the "branch" of the story (Success vs. Failure).
- [Action] provides the "verbs" the model must avoid repeating.
3. Recommended Inference Parameters
These settings control the "randomness" and "flow" of the model. Using the wrong settings can result in repetitive loops or overly chaotic text.
| Parameter | Value | Purpose |
|---|---|---|
| Temperature | 0.8 | Allows for creative prose without losing logical coherence. |
| Min-P | 0.05 | CRITICAL. Filters out nonsensical words while allowing for high-quality synonyms. |
| Top-P | 1.0 | (Disabled/Set to Max) Let Min-P handle the filtering. |
| Repetition Penalty | 1.15 | Prevents the model from overusing favorite words like "molten" or "visceral." |
| Max Tokens | 150 - 250 | Keeps the narrative punchy and prevents "rambling." |
4. Gameplay Logic: What to Expect
Users should be aware of the "Physics Lock" we programmed into the model to ensure a high-quality "Grim-Hollow" feel:
- The Recoil Rule: If a user provides a
{"success": false}, the model will never describe the door breaking or the enemy flinching. Instead, it will describe the jarring impact on the player's body. - Sensory Requirement: Every response will include exactly one sound and one physical sensation. If the user doesn't see these, they should check if the System Prompt is properly loaded.
- Anti-Hyperbole: The model is trained to be grounded. If a player fails to pick a lock, the model will describe the "click of a jammed pin" or "cramping fingers," not the lock exploding or the player's hand falling off.
Troubleshooting for Users
Problem: The model is repeating my words.
Fix: Increase Repetition Penalty to 1.2 and ensure the Anti-Echo Law is in the System Prompt.
Problem: The model says I succeeded when the JSON says False.
Fix: Ensure the
[Mechanics]block is clearly separated by a new line. The model needs to "see" the failure flag clearly to trigger its recoil logic.
Example prompts
Here is a "Quick Start" cheat sheet for testing the Engine Narrator. These examples are specifically designed to test the boundaries of the logic we've trained: Physics Recoil, Hazard Integration, and Grounded Scale.
🧪 The Engine Narrator Test Suite
Copy and paste these into the User input field to verify the LoRA is functioning correctly across different RPG scenarios.
1. The "Physics Recoil" Test (Strength/Failure)
Goal: Verify the gate doesn't move and the energy "bounces back" into the character.
[Context]: {"location": "The Iron Oubliette", "target": "Heavy Cell Door"}
[Mechanics]: {"success": false, "difficulty": 15}
[Action]: I deliver a powerful kick to the door's center.
2. The "Environmental Hazard" Test (Dexterity/Failure)
Goal: Ensure the hazard (acid) is described viscerally without being world-ending.
[Context]: {"location": "Alchemist's Drain", "target": "Stone Ledge"}
[Mechanics]: {"success": false, "hazard": "acid-splash"}
[Action]: I try to leap across the gap to the far ledge.
3. The "Social Grit" Test (Charisma/Success)
Goal: Verify the model can handle non-combat tension and grounded social feedback.
[Context]: {"location": "The Gallow's Inn", "target": "Nervous Informant"}
[Mechanics]: {"success": true, "intimidation": 18}
[Action]: I slam my dagger into the table an inch from his hand.
4. The "Fine Motor" Test (Dexterity/Success)
Goal: Check if the model can describe precise sensory stacking (sound and touch) in a quiet scene.
[Context]: {"location": "The Vault Room", "target": "Brass Lockbox"}
[Mechanics]: {"success": true, "tools": "lockpicks"}
[Action]: I gently probe the tumblers with my tension wrench.
5. The "Medical Grit" Test (Intelligence/Success)
Goal: Test the "Scale Control" to ensure healing is described as gritty and physical, not magical.
[Context]: {"location": "Battlefield Medic Tent", "target": "Open Wound"}
[Mechanics]: {"success": true, "medicine": 12}
[Action]: I apply a cauterizing iron to the jagged gash.
How to Grade the Results
When you run these, use this quick checklist. If the answer to any of these is "No," your Inference Parameters (like Min-P or Repetition Penalty) need adjustment:
- Logic: Did a "Success: False" result in an unmoved object?
- Sensory: Is there one sound and one touch?
- Anti-Echo: Did the model avoid using the verbs you used in the [Action]?
- Scale: Did it avoid "fingers turning to ash" or other magical hyperbole?
Citation
If you use this model in your research, please cite the original FIREBALL dataset:
@inproceedings{zhu-etal-2023-fireball,
title = "{FIREBALL}: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information",
author = "Zhu, Andrew and Aggarwal, Karmanya and Feng, Alexander and Martin, Lara J. and Callison-Burch, Chris",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
year = "2023",
url = "[https://aclanthology.org/2023.acl-long.229](https://aclanthology.org/2023.acl-long.229)",
}
- Downloads last month
- 25
8-bit
16-bit
docker model run hf.co/FormatC/Qwen3-4B-DND:F16