PIT-Q8_0 β€” Police Interview Trainer (GGUF Q8_0)

FOR TRAINING AND RESEARCH PURPOSES ONLY. Not for operational policing, legal advice, or use as evidence in any proceedings. The creator accepts no responsibility or liability for any use or misuse of this model. Model outputs may be inaccurate or incomplete.

Model Description

This is the Q8_0 GGUF quantised version of EryriLabs/PIT, reduced from ~39 GB (F16) to 21 GB with near-lossless quality. All weights are quantised to 8-bit integers, offering an excellent balance between quality preservation and reduced memory footprint.

PIT (Police Interview Trainer) is a domain-adapted language model for UK police interview roleplay training. It simulates realistic suspect behaviour across multiple scenario types, enabling trainee officers to practise the PEACE interview framework in a safe environment.

Base model: unsloth/gpt-oss-20b β€” a 21B parameter Mixture-of-Experts model with 3.6B active parameters per forward pass.

Training Pipeline

The model was created through three training stages followed by a GGUF export step, with all adapters merged before conversion:

1. Continued Pre-Training (CPT) β€” UK Criminal Law

  • Corpus: ~10.7 million tokens of UK criminal law material
  • Coverage: Legislation, case law, PACE codes, CPS guidance, sentencing guidelines
  • Adapter: LoRA r=64, 3 epochs, 1,971 steps

2. Continued Pre-Training (CPT) β€” Police Interview Technique

  • Corpus: ~53,000 tokens of PIP Level 1 interview training material
  • Coverage: PEACE framework, questioning techniques, suspect management, vulnerable persons
  • Adapter: LoRA r=32, 10 epochs, 80 steps
  • Stacked on: Stage 1 adapter

3. Supervised Fine-Tuning (SFT) β€” Interview Roleplay

  • Dataset: 523 examples across 6 interaction modes
  • Adapter: LoRA r=32, 3 epochs, 198 steps
  • Stacked on: Stage 1 + Stage 2 adapters

4. GGUF Q8_0 Export

All three adapter layers were reconstructed on the base model, merged, and converted to GGUF format with Q8_0 quantisation. This applies 8-bit integer quantisation across all weights, providing near-lossless quality compared to the full-precision model.
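The conversion step can be reproduced with llama.cpp's conversion script. A minimal sketch, assuming the merged HuggingFace-format model has been saved to `./pit-merged` (the path is illustrative):

```shell
# Convert the merged HF model to GGUF with Q8_0 quantisation
python llama.cpp/convert_hf_to_gguf.py ./pit-merged \
    --outfile pit_q8_0.gguf \
    --outtype q8_0
```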

SFT Modes

| Mode | Examples | Description |
|------|----------|-------------|
| Suspect roleplay | 200 | In-character suspect responses (cooperative, deceptive, no-comment) |
| Assessment | 120 | Post-interview PIP Level 1 assessment feedback |
| PEACE knowledge | 80 | Direct Q&A about the PEACE framework and interview law |
| Witness roleplay | 60 | In-character witness responses |
| Scenario presentation | 33 | Generating interview briefing scenarios |
| Special procedures | 30 | Handling vulnerable suspects, appropriate adults, mental health |
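As a quick consistency check, the per-mode counts sum to the 523 SFT examples stated above (the dictionary keys are illustrative identifiers, not official dataset labels):

```python
# Per-mode example counts from the SFT Modes table
SFT_MODES = {
    "suspect_roleplay": 200,
    "assessment": 120,
    "peace_knowledge": 80,
    "witness_roleplay": 60,
    "scenario_presentation": 33,
    "special_procedures": 30,
}

total = sum(SFT_MODES.values())
print(total)  # → 523, matching the stated SFT dataset size
```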

Available Quantisations

| Quantisation | Size | Format | Notes |
|--------------|------|--------|-------|
| Q8_0 (this model) | 21 GB | GGUF | Near-lossless 8-bit quantisation |
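The file size can be sanity-checked from first principles: llama.cpp's Q8_0 format stores each block of 32 weights as 32 int8 values plus one fp16 scale (34 bytes per 32 weights, i.e. ~8.5 bits per weight):

```python
# Back-of-envelope size estimate for a 21B-parameter model in Q8_0
params = 21e9                 # total parameters (MoE: all experts are stored)
bytes_per_weight = 34 / 32    # Q8_0 block: 32 int8 quants + one fp16 scale
size_gib = params * bytes_per_weight / 2**30
print(round(size_gib, 1))     # → 20.8, consistent with the ~21 GB file
```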

Quick Start

Using with llama.cpp

# Download the model
huggingface-cli download EryriLabs/PIT-Q8_0 pit_q8_0.gguf --local-dir .

# Run with llama-server
llama-server -m pit_q8_0.gguf -c 8192 -ngl 99

Then open http://localhost:8080 for the built-in chat UI.
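llama-server also exposes an OpenAI-compatible API at /v1/chat/completions, which is useful for scripted training drills. A minimal sketch using only the standard library (the "model" name is illustrative; the request only fires once the server above is running, so the network call is left commented out):

```python
import json
import urllib.request

# Build an OpenAI-compatible chat request for llama-server
payload = {
    "model": "pit",
    "messages": [
        {
            "role": "system",
            "content": (
                "You are PIT (Police Interview Trainer), simulating a suspect "
                "in a police interview training exercise."
            ),
        },
        {"role": "user", "content": "Do you understand the caution?"},
    ],
    "temperature": 0.7,
}

request = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once llama-server is running:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```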

Using with Ollama

# Create a Modelfile
cat <<EOF > Modelfile
FROM ./pit_q8_0.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 8192
SYSTEM "You are PIT (Police Interview Trainer), simulating a suspect in a police interview training exercise."
EOF

# Create and run
ollama create pit -f Modelfile
ollama run pit

Using with LM Studio

  1. Download pit_q8_0.gguf
  2. Place in your LM Studio models directory
  3. Load the model and begin chatting

Using the full PIT application (recommended)

The PIT application includes a web interface with scenario selection, interview simulation, transcript recording, and automated assessment.

cd pit-app
docker compose up

Then open http://localhost:3000.

Requirements:

  • GPU with 24GB+ VRAM (single GPU) or 2x 12GB+ GPUs with layer splitting
  • ~21 GB disk space
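For the dual-GPU case, llama.cpp can split the model across cards with --tensor-split; a minimal sketch (the even 1,1 ratio is an assumption and may need tuning for your cards):

```shell
# Split layers roughly evenly across two GPUs
llama-server -m pit_q8_0.gguf -c 8192 -ngl 99 --tensor-split 1,1
```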

Example prompt

<|system|>
You are PIT (Police Interview Trainer), simulating a suspect in a police interview training exercise.

YOUR CHARACTER: Tyler Bennett, 23 years old, male.
BEHAVIOUR: cooperative

INSTRUCTIONS:
- Stay in character throughout
- Use natural everyday speech
- Keep responses to 1-3 sentences
<|end|>
<|user|>
I am cautioning you. You do not have to say anything. But it may harm your defence if you do not mention when questioned something which you later rely on in court. Anything you do say may be given in evidence. Do you understand the caution?
<|end|>
<|assistant|>
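The template above can be assembled programmatically; a minimal sketch (the helper name is hypothetical):

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a PIT prompt using the chat template shown above."""
    return (
        f"<|system|>\n{system}\n<|end|>\n"
        f"<|user|>\n{user}\n<|end|>\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt(
    "You are PIT (Police Interview Trainer), simulating a suspect in a "
    "police interview training exercise.",
    "Do you understand the caution?",
)
print(prompt)
```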

Intended Use

  • Police interview training and education
  • Academic research into interview techniques
  • Roleplay simulation for PEACE framework practice
  • PIP Level 1 assessment preparation

Out of Scope

  • Operational policing decisions
  • Legal advice or guidance
  • Evidence in any legal proceedings
  • Replacement for human interview training supervision
  • Any commercial use without explicit permission

Technical Details

  • Architecture: Mixture-of-Experts (MoE), 21B total / 3.6B active parameters
  • Format: GGUF (Q8_0)
  • Precision: 8-bit integer quantisation (all weights)
  • Original precision: BFloat16
  • Training method: QLoRA (4-bit quantised base, 16-bit adapters)
  • Hardware: 2x NVIDIA RTX 3090 (24GB each)
  • Framework: Unsloth + HuggingFace Transformers + llama.cpp

Disclaimer

THIS MODEL IS PROVIDED FOR TRAINING AND RESEARCH PURPOSES ONLY.

This model is not intended for, and should not be used in, operational policing, legal proceedings, or any context where its outputs could affect real individuals or cases. The model may generate inaccurate, incomplete, or inappropriate content. The creator accepts no responsibility or liability whatsoever for any use or misuse of this model or its outputs.

Users are solely responsible for ensuring their use complies with all applicable laws and regulations.

The training data may contain public sector information licensed under the Open Government Licence v3.0 and material licensed under the Non-Commercial College Licence.

License

CC-BY-NC-ND-4.0 (Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International)

Citation

@misc{eryrilabs2026pit,
  title={PIT: Police Interview Trainer (GGUF Q8\_0)},
  author={EryriLabs},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/EryriLabs/PIT-Q8_0}
}