# DutyBot GGUF

A domain-adapted language model for UK policing – offences, points to prove, PACE powers, and operational guidance. Built for the DutyBot Docker application. For training and educational purposes only.
## Model Details

| Property | Value |
|---|---|
| Base model | unsloth/gpt-oss-20b |
| Architecture | Mixture of Experts – 21B total parameters, 3.6B active per token |
| Training method | Continued pretraining with QLoRA (rank 64, bf16) |
| Corpus | 10,511 chunks (~10.7M tokens) of UK criminal law |
| Training loss | 3.90 → 1.73 |
| Context length | 131,072 (native), trained at 1,024, tested at 4,096 |
| Quantisation | Q4_K_M |
| File size | ~14.7 GB |
| Chat template | ChatML (<\|im_start\|> / <\|im_end\|>) |
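The ChatML template wraps every turn in `<|im_start|>` / `<|im_end|>` markers. As a minimal sketch of how a conversation is rendered (the `render_chatml` helper is hypothetical, for illustration only – llama.cpp and llama-cpp-python apply this template for you):

```python
# Illustrative sketch of ChatML prompt rendering. `render_chatml` is a
# hypothetical helper, not part of DutyBot or llama.cpp.
def render_chatml(messages: list[dict]) -> str:
    parts = []
    for msg in messages:
        # Each turn: <|im_start|>ROLE\nCONTENT<|im_end|>
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Leave an open assistant turn for the model to complete
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are DutyBot."},
    {"role": "user", "content": "Define theft."},
])
```

This is also why `<|im_end|>` and `<|im_start|>` are used as stop strings in the examples below: they mark the turn boundaries.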
## How to Use

### With DutyBot (recommended)

The easiest way to use this model is with the DutyBot Docker app, which provides a full chat UI, conversation history, memory, and automatic legislation verification:

```bash
git clone https://github.com/dwain-barnes/dutybot.git
cd dutybot
docker compose up
# Open http://localhost:5000
```
### With llama.cpp directly

```bash
# Download the GGUF
huggingface-cli download EryriLabs/dutybot-GGUF domain_adapted-Q4_K_M.gguf --local-dir ./models

# Run the server
llama-server \
  --model ./models/domain_adapted-Q4_K_M.gguf \
  --host 0.0.0.0 --port 8080 \
  --ctx-size 4096 \
  --n-gpu-layers 999 \
  --chat-template chatml
```
Then query the OpenAI-compatible API:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dutybot",
    "messages": [
      {"role": "system", "content": "You are DutyBot, a UK Police Duty Assistant for training purposes."},
      {"role": "user", "content": "What are the points to prove for Section 18 GBH?"}
    ],
    "max_tokens": 512,
    "temperature": 0.3,
    "stop": ["<|im_end|>", "<|im_start|>"],
    "frequency_penalty": 0.6
  }'
```
### With Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/domain_adapted-Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,
    chat_format="chatml",
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are DutyBot, a UK Police Duty Assistant for training purposes."},
        {"role": "user", "content": "Explain the difference between ABH and GBH"},
    ],
    max_tokens=512,
    temperature=0.3,
    stop=["<|im_end|>", "<|im_start|>"],
    frequency_penalty=0.6,
)
print(response["choices"][0]["message"]["content"])
```
## Training Details

### Corpus

The training corpus covers UK criminal law across these domains:

- Criminal offences – definitions, elements, and points to prove for offences under major UK statutes (Theft Act 1968, Offences Against the Person Act 1861, Criminal Damage Act 1971, Sexual Offences Act 2003, Misuse of Drugs Act 1971, and others)
- PACE – Police and Criminal Evidence Act 1984 codes of practice (stop and search, arrest, detention, investigation, identification)
- Sentencing – Sentencing Council guidelines and magistrates' court sentencing guidelines
- CPS guidance – Crown Prosecution Service charging standards and legal guidance
- Operational policing – powers, procedures, and general policing knowledge

The corpus was structured as 10,511 text chunks, totalling approximately 10.7 million tokens.
### Method

- Continued pretraining (CPT) – the model was exposed to the full corpus to inject domain knowledge, rather than instruction-tuned for a specific format
- QLoRA – 4-bit quantised base weights with rank-64 LoRA adapters in bf16, reducing GPU memory requirements
- Hyperparameters:
  - Learning rate: 5e-5 with cosine schedule
  - Batch size: 1 with 16 gradient accumulation steps (effective batch 16)
  - Sequence length: 1,024 tokens
  - Epochs: 3
  - Total steps: 1,971
- Hardware: 2x NVIDIA RTX 3090 (24GB each)
- Software: Unsloth + Hugging Face Transformers + TRL
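The step count follows directly from the corpus size and batch settings; a quick arithmetic check (assuming one pass over all 10,511 chunks per epoch, with the final partial batch kept):

```python
import math

# Sanity check: optimiser steps implied by corpus size, effective batch, and epochs.
chunks = 10_511
effective_batch = 16   # batch size 1 x 16 gradient accumulation steps
epochs = 3

steps_per_epoch = math.ceil(chunks / effective_batch)  # 657
total_steps = steps_per_epoch * epochs
print(total_steps)  # 1971 — matches the reported total
```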
### Loss Curve

| Step | Training Loss |
|---|---|
| 0 | 3.90 |
| 100 | 1.94 |
| 500 | 1.80 |
| 670 | 1.73 |
| 1000 | ~1.65 |

The loss showed a healthy, monotonic decline, indicating successful knowledge injection without catastrophic forgetting.
## Intended Use

### In scope

- Police training exercises and scenario planning
- Educational materials about UK criminal law
- Studying offence definitions, points to prove, and powers
- Building training tools for police forces and law enforcement academies

### Out of scope

- Live operational policing decisions – this model is not a substitute for professional legal advice, force policy, or the judgement of trained officers
- Legal advice – the model may produce inaccurate or incomplete legal information
- Jurisdictions outside England & Wales – the training data is primarily based on English and Welsh law; Scottish and Northern Irish law differ significantly
## Limitations

- May fabricate legal definitions – like all language models, DutyBot can generate plausible-sounding but incorrect legal information. Always verify against official sources.
- Training data currency – the corpus reflects the law as of the training date. Legislation changes frequently.
- Repetition – the model can sometimes repeat itself, especially on longer generations. Setting `frequency_penalty: 0.6` and `max_tokens: 512` helps mitigate this.
- No case law – the training data focuses on statute law and guidance rather than case law precedents.
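The `frequency_penalty` mitigation works by subtracting a count-scaled term from the logits of tokens that have already been generated. A minimal sketch of the OpenAI-style formula (illustrative only; llama.cpp's internal implementation may differ in detail):

```python
from collections import Counter

def apply_penalties(logits, generated, freq_penalty=0.6, presence_penalty=0.3):
    """OpenAI-style repetition penalties: each already-generated token's logit
    is reduced by freq_penalty * (occurrence count) plus a flat
    presence_penalty if it has appeared at all. Illustrative sketch."""
    counts = Counter(generated)
    out = dict(logits)
    for tok, count in counts.items():
        if tok in out:
            out[tok] -= freq_penalty * count + presence_penalty
    return out

penalised = apply_penalties({"the": 2.0, "law": 1.5}, ["the", "the", "act"])
# "the" appeared twice: 2.0 - 0.6*2 - 0.3 = 0.5; "law" never appeared, so 1.5
```

This is why repeated phrases become progressively less likely as the output grows, which curbs the repetition loops noted above.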
## System Prompt

For best results, use this system prompt:

```text
You are DutyBot, a UK Police Duty Assistant. You help police officers with
operational guidance, definitions of offences, points to prove, and general
policing knowledge based on UK law.

IMPORTANT CONSTRAINTS:
- You are for TRAINING AND EDUCATIONAL PURPOSES ONLY – never for live operational use
- Always encourage officers to verify guidance against local force policy and official sources
- Be professional, precise, and cite legislation where possible
- If unsure, say so clearly – never fabricate legal definitions
- When legislation lookup results are provided, use them to ground your answer
```
## Recommended Inference Parameters

| Parameter | Value | Notes |
|---|---|---|
| `temperature` | 0.3 | Low temperature for factual responses |
| `max_tokens` | 512 | Prevents repetition on long outputs |
| `frequency_penalty` | 0.6 | Reduces repetitive phrasing |
| `presence_penalty` | 0.3 | Encourages topic diversity |
| `stop` | ["<\|im_end\|>", "<\|im_start\|>"] | Proper turn boundaries |
| `ctx_size` | 4096 | Good balance of context and speed |
## Hardware Requirements

| Setup | VRAM | Speed |
|---|---|---|
| 2x RTX 3090 (full offload) | ~16GB total | Fast |
| 1x RTX 3090/4090 (partial offload) | 24GB | Moderate |
| CPU only | 0 (uses RAM) | Slow (~1-2 tok/s) |

A minimum of 16GB system RAM is recommended. The GGUF file itself is 14.7GB.
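As a rough cross-check of the file size: Q4_K_M is a mixed-precision format (mostly 4-bit blocks with some higher-precision tensors), so the implied average bits per weight should land in the vicinity of 4.5–6:

```python
# Rough sanity check: average bits per weight implied by the 14.7 GB file
# and the 21B total parameters. Ignores GGUF metadata overhead.
file_bytes = 14.7e9
params = 21e9

bits_per_weight = file_bytes * 8 / params
print(round(bits_per_weight, 1))  # 5.6
```

About 5.6 bits per weight, consistent with a Q4_K_M quantisation of this model.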
## Citation

```bibtex
@misc{dutybot2026,
  title={DutyBot: A Domain-Adapted Language Model for UK Police Training},
  author={EryriLabs},
  year={2026},
  url={https://huggingface.co/EryriLabs/dutybot-GGUF}
}
```
## Disclaimer

This model and associated software are provided strictly for research and educational purposes. They are not intended for production use, operational deployment, or commercial purposes.
- No warranty: This model is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, or non-infringement.
- No liability: The author(s) accept no responsibility or liability for any errors, omissions, or outcomes arising from the use of this model or its outputs.
- Not legal advice: Nothing produced by this model constitutes legal, professional, or operational advice. Outputs may be inaccurate, incomplete, or outdated. Always consult qualified professionals and official sources.
- Not for operational policing: This model must not be used for live operational decision-making. It is not a substitute for professional judgement, force policy, or official legal guidance.
- Non-commercial use only: The model weights are licensed under CC-BY-NC-ND-4.0 and must not be used for commercial purposes.
- Use at your own risk: You are solely responsible for how you use this model and any decisions made based on its output.
## License

CC-BY-NC-ND-4.0 – Non-commercial use only. No derivatives without permission. The training corpus contains Crown copyright material used under the Open Government Licence.
## Acknowledgements
- GPT-OSS 20B base model
- Unsloth for efficient QLoRA training
- llama.cpp for GGUF inference
- UK legislation sourced from legislation.gov.uk