# PIT-MXFP4 - Police Interview Trainer
**FOR TRAINING AND RESEARCH PURPOSES ONLY.** Not for operational policing, legal advice, or use as evidence in any proceedings. The creator accepts no responsibility or liability for any use or misuse of this model. Model outputs may be inaccurate or incomplete.
## Model Description
This is the MXFP4 quantised version of EryriLabs/PIT, reduced from ~39 GB to 13.8 GB with no noticeable quality degradation. MoE expert weights are quantised to MXFP4 while attention layers, router weights, embeddings, and the language model head remain in full precision.
PIT (Police Interview Trainer) is a domain-adapted language model for UK police interview roleplay training. It simulates realistic suspect behaviour across multiple scenario types, enabling trainee officers to practise the PEACE interview framework in a safe environment.
Base model: unsloth/gpt-oss-20b — a 21B parameter Mixture-of-Experts model with 3.6B active parameters per forward pass.
## Training Pipeline

The model was created through three training stages, followed by a quantised export step:
1. Continued Pre-Training (CPT) — UK Criminal Law
- Corpus: ~10.7 million tokens of UK criminal law material
- Coverage: Legislation, case law, PACE codes, CPS guidance, sentencing guidelines
- Adapter: LoRA r=64, 3 epochs, 1,971 steps
2. Continued Pre-Training (CPT) — Police Interview Technique
- Corpus: ~53,000 tokens of PIP Level 1 interview training material
- Coverage: PEACE framework, questioning techniques, suspect management, vulnerable persons
- Adapter: LoRA r=32, 10 epochs, 80 steps
- Stacked on: Stage 1 adapter
3. Supervised Fine-Tuning (SFT) — Interview Roleplay
- Dataset: 523 examples across 6 interaction modes
- Adapter: LoRA r=32, 3 epochs, 198 steps
- Stacked on: Stage 1 + Stage 2 adapters
4. MXFP4 Export
All three adapter layers were reconstructed on the base model and merged using Unsloth's native save_pretrained_merged(save_method="mxfp4"). This quantises the MoE expert weights (gate_up_proj, down_proj) to MXFP4 while preserving attention, router, embeddings, and lm_head in original precision.
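As a rough illustration of what the LoRA ranks above mean in parameter terms, a rank-r adapter on a d_out × d_in weight matrix adds r·(d_in + d_out) trainable parameters. The projection size below is hypothetical for illustration only, not taken from the gpt-oss-20b config:

```python
# Trainable parameters added by a rank-r LoRA adapter on a (d_out x d_in)
# weight: matrix A is (r x d_in) and matrix B is (d_out x r).
def lora_params(d_out: int, d_in: int, r: int) -> int:
    return r * (d_in + d_out)

# Hypothetical 4096x4096 projection (illustrative dimensions only)
full = 4096 * 4096
for r in (64, 32):
    added = lora_params(4096, 4096, r)
    print(f"r={r}: {added:,} LoRA params ({added / full:.2%} of the full matrix)")
```

This is why stacking several adapters (as in stages 1-3 above) stays cheap relative to the base model: each adapter touches only a small fraction of the weights.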
## SFT Modes
| Mode | Examples | Description |
|---|---|---|
| Suspect roleplay | 200 | In-character suspect responses (cooperative, deceptive, no-comment) |
| Assessment | 120 | Post-interview PIP Level 1 assessment feedback |
| PEACE knowledge | 80 | Direct Q&A about PEACE framework and interview law |
| Witness roleplay | 60 | In-character witness responses |
| Scenario presentation | 33 | Generating interview briefing scenarios |
| Special procedures | 30 | Handling vulnerable suspects, appropriate adults, mental health |
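The per-mode counts in the table sum to the 523 SFT examples reported in the training pipeline:

```python
# SFT example counts per interaction mode, as listed in the table above
sft_modes = {
    "Suspect roleplay": 200,
    "Assessment": 120,
    "PEACE knowledge": 80,
    "Witness roleplay": 60,
    "Scenario presentation": 33,
    "Special procedures": 30,
}
total = sum(sft_modes.values())
print(total)  # 523
```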
## Performance vs BF16

Tested on 2x NVIDIA RTX 3090 with `device_map="auto"`:
| Metric | BF16 (EryriLabs/PIT) | MXFP4 (this model) |
|---|---|---|
| Model size | ~39 GB | 13.8 GB |
| Load time | 141s | 7.6s |
| Generation speed | ~13 tok/s | ~16 tok/s |
| Quality | Baseline | Comparable (no noticeable degradation) |
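For a quick sense of scale, the headline numbers in the table work out to roughly these ratios:

```python
# Approximate ratios derived from the benchmark table above
size_reduction = 39 / 13.8   # BF16 size / MXFP4 size
load_speedup = 141 / 7.6     # BF16 load time / MXFP4 load time
gen_speedup = 16 / 13        # MXFP4 tok/s / BF16 tok/s
print(f"~{size_reduction:.1f}x smaller, ~{load_speedup:.0f}x faster to load, "
      f"~{gen_speedup:.2f}x generation throughput")
```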
## Quick Start with Docker

```bash
git clone https://huggingface.co/EryriLabs/PIT-MXFP4
cd PIT-MXFP4
# Or clone the application repo, which includes the Docker setup
```
### Using the full application (recommended)
The PIT application includes a web interface with scenario selection, interview simulation, transcript recording, and automated assessment.
```bash
cd pit-app
docker compose up
```
Then open http://localhost:3000.
Requirements:
- Docker with NVIDIA Container Toolkit
- GPU with 16GB+ VRAM (single GPU) or 2x 12GB+ GPUs
- ~14GB disk space
### Using with Transformers directly
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "EryriLabs/PIT-MXFP4",
    dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("EryriLabs/PIT-MXFP4", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are PIT (Police Interview Trainer), simulating a suspect in a police interview training exercise.\n\nYOUR CHARACTER: Tyler Bennett, 23 years old, male.\nBEHAVIOUR: cooperative\n\nINSTRUCTIONS:\n- Stay in character throughout\n- Use natural everyday speech\n- Keep responses to 1-3 sentences"},
    {"role": "user", "content": "I am cautioning you. You do not have to say anything. But it may harm your defence if you do not mention when questioned something which you later rely on in court. Anything you do say may be given in evidence. Do you understand the caution?"},
]

# Render the chat template, then tokenise. add_special_tokens=False avoids
# duplicating special tokens the template has already inserted.
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt", add_special_tokens=False).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
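The system prompt in the example follows a simple template: character, behaviour, instructions. A small helper makes it easy to vary the scenario; note this is an illustrative sketch based on the example above, not an official prompt format, and the model may be sensitive to wording that differs from its training data:

```python
def build_suspect_prompt(name: str, age: int, gender: str, behaviour: str) -> str:
    """Build a suspect-roleplay system prompt following the template shown
    in the example above. Illustrative helper only, not an official API."""
    return (
        "You are PIT (Police Interview Trainer), simulating a suspect in a "
        "police interview training exercise.\n\n"
        f"YOUR CHARACTER: {name}, {age} years old, {gender}.\n"
        f"BEHAVIOUR: {behaviour}\n\n"
        "INSTRUCTIONS:\n"
        "- Stay in character throughout\n"
        "- Use natural everyday speech\n"
        "- Keep responses to 1-3 sentences"
    )

# behaviour values seen in the SFT data include cooperative, deceptive, no-comment
messages = [
    {"role": "system", "content": build_suspect_prompt("Tyler Bennett", 23, "male", "no-comment")},
    {"role": "user", "content": "Where were you on the night in question?"},
]
```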
## Intended Use
- Police interview training and education
- Academic research into interview techniques
- Roleplay simulation for PEACE framework practice
- PIP Level 1 assessment preparation
## Out of Scope
- Operational policing decisions
- Legal advice or guidance
- Evidence in any legal proceedings
- Replacement for human interview training supervision
- Any commercial use without explicit permission
## Technical Details
- Architecture: Mixture-of-Experts (MoE), 21B total / 3.6B active parameters
- Precision: MXFP4 (expert weights), bfloat16 (attention, router, embeddings, lm_head)
- Training method: QLoRA (4-bit quantised base, 16-bit adapters)
- Hardware: 2x NVIDIA RTX 3090 (24GB each)
- Framework: Unsloth + HuggingFace Transformers
## Disclaimer
THIS MODEL IS PROVIDED FOR TRAINING AND RESEARCH PURPOSES ONLY.
This model is not intended for, and should not be used in, operational policing, legal proceedings, or any context where its outputs could affect real individuals or cases. The model may generate inaccurate, incomplete, or inappropriate content. The creator accepts no responsibility or liability whatsoever for any use or misuse of this model or its outputs.
Users are solely responsible for ensuring their use complies with all applicable laws and regulations.
Training data may contain public sector information licensed under the Open Government Licence v3.0 and information licensed under the Non-Commercial College Licence.
## License
CC-BY-NC-ND-4.0 (Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International)
## Citation
```bibtex
@misc{eryrilabs2026pit,
  title={PIT: Police Interview Trainer (MXFP4)},
  author={EryriLabs},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/EryriLabs/PIT-MXFP4}
}
```