---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- quantum
- confidence-estimation
- uncertainty
- pennylane
- gpt-oss
- hallucination-detection
pipeline_tag: text-classification
---
# Q-GPT

### Quantum-Enhanced Confidence Estimation for Language Models
[PennyLane](https://pennylane.ai/) · [PyTorch](https://pytorch.org/) · [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

**Know when your LLM is confident, and when it's guessing.**
---
## What is Q-GPT?
Q-GPT is a **quantum neural network head** that attaches to any language model and estimates how confident the model is in its response. It helps you detect when the model might be "hallucinating" or making up information.
### The Problem
Large Language Models (LLMs) always produce fluent text, even when they don't know the answer. They sound confident even when they are wrong, which makes their outputs hard to trust in critical applications.
### The Solution
Q-GPT analyzes the internal hidden states of the model using a **variational quantum circuit**. Entanglement and the circuit's inherently probabilistic structure can capture patterns and uncertainties that classical networks may miss. The result is a confidence score that tells you whether to trust the response.
---
## How It Works
```
               Q-GPT Architecture

  LLM hidden states              Quantum circuit
  [2880 dimensions]                 [4 qubits]
         │
         ▼
  ┌──────────────┐
  │  Compress    │────────────────────┐
  │  to 4 dims   │                    │
  └──────────────┘                    ▼
                        ┌─────────────────────────┐
                        │  RY / RZ encoding       │
                        ├─────────────────────────┤
                        │  Rot ──● CNOT   Layer 1 │
                        ├─────────────────────────┤
                        │  Rot ──● CNOT   Layer 2 │
                        ├─────────────────────────┤
                        │  Rot ──● CNOT   Layer 3 │
                        └─────────────────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  Measure ⟨Z⟩ on         │
                        │  each qubit             │
                        └─────────────────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  Confidence 0.0 - 1.0   │
                        └─────────────────────────┘
```
### Step by Step:
1. **Extract hidden states.** When the LLM generates a response, we capture its internal representation (the hidden states from the last layer).
2. **Compress.** The high-dimensional hidden states (2880 dimensions for GPT-OSS) are compressed to 4 values by a small neural network.
3. **Quantum encoding.** The 4 values are encoded into quantum states using rotation gates (RY, RZ); each value sets the rotation angle of one qubit.
4. **Variational layers.** The qubits pass through multiple layers of:
   - **Rotation gates**: trainable parameters that learn patterns
   - **CNOT gates**: create entanglement between qubits
5. **Measurement.** We measure the expectation value ⟨Z⟩ of each qubit, giving us 4 numbers between -1 and +1.
6. **Confidence output.** A final layer converts these measurements into a confidence score (0-1) and an uncertainty estimate.
### Why Quantum?
- **Entanglement** can capture correlations in the data that classical networks may miss
- **Superposition** lets the circuit explore multiple states simultaneously
- **Inherent probabilistic nature** is a natural fit for representing uncertainty
- **Compact representation**: 4 qubits span a 16-dimensional state space
---
## What You Get
| Output | Description |
|--------|-------------|
| `confidence` | Score from 0.0 to 1.0: how sure the model is |
| `uncertainty` | Quantum-derived uncertainty measure |
| `should_refuse` | Boolean; True if confidence < 0.3 (model should decline to answer) |
| `confidence_label` | Human-readable: "very high", "high", "moderate", "low", "very low" |
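The table's derived fields can be reproduced from the raw score alone. This hypothetical helper (`confidence_outputs`) uses the 0.3 refusal threshold stated above; the other label boundaries are illustrative assumptions, not Q-GPT's actual cutoffs.

```python
def confidence_outputs(confidence: float) -> dict:
    """Map a raw confidence score to the derived fields in the table."""
    # Refusal threshold of 0.3 comes from the table; the remaining
    # bucket boundaries below are assumed for illustration.
    buckets = [
        (0.9, "very high"),
        (0.7, "high"),
        (0.5, "moderate"),
        (0.3, "low"),
        (0.0, "very low"),
    ]
    label = next(name for bound, name in buckets if confidence >= bound)
    return {
        "confidence": confidence,
        "should_refuse": confidence < 0.3,
        "confidence_label": label,
    }

print(confidence_outputs(0.82))
# {'confidence': 0.82, 'should_refuse': False, 'confidence_label': 'high'}
```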
---
## Usage
### Installation
```bash
pip install pennylane torch transformers
```
### Quick Start
```python
from quantum_head import load_qgpt
# Load model with quantum head
model, tokenizer = load_qgpt("squ11z1/gpt-oss-9b-reasoning")
# Prepare input
prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate with confidence
outputs = model.generate_with_confidence(
    inputs.input_ids,
    max_new_tokens=50,
)
# Check results
print(f"Response: {tokenizer.decode(outputs['sequences'][0])}")
print(f"Confidence: {outputs['confidence_label']}") # "high"
print(f"Should refuse: {outputs['should_refuse']}") # False
```
### Using Just the Quantum Head
```python
from quantum_head import QuantumHead
import torch
# Create quantum head for your model's hidden size
head = QuantumHead(hidden_size=2880)
# Get hidden states from your model
# hidden_states shape: [batch_size, hidden_size]
hidden_states = torch.randn(1, 2880)
# Get confidence
output = head(hidden_states)
print(f"Confidence: {output['confidence'].item():.2%}")
```
---
## Training the Quantum Head
The quantum head can be trained on examples where you know if the model was correct:
```python
from train import train_quantum_head
train_quantum_head(
    model_name="squ11z1/gpt-oss-9b-reasoning",
    train_data_path="train_data.jsonl",  # each line: {text, confidence, is_correct}
    epochs=3,
)
```
Training data format (JSONL):
```jsonl
{"text": "What is 2+2? The answer is 4.", "confidence": 0.95, "is_correct": true}
{"text": "The moon is made of cheese.", "confidence": 0.2, "is_correct": false}
```
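Reading this JSONL format takes only a few lines with the standard library. The function name `load_training_examples` is hypothetical; the field names match the format shown above.

```python
import json

def load_training_examples(path: str) -> list:
    """Read {text, confidence, is_correct} records from a JSONL file."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # tolerate blank lines
                continue
            record = json.loads(line)
            examples.append(
                (record["text"], float(record["confidence"]), bool(record["is_correct"]))
            )
    return examples
```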
---
## Files
| File | Description |
|------|-------------|
| `quantum_head.py` | Main implementation (QuantumHead, QGPT, load_qgpt) |
| `train.py` | Training script for the quantum head |
| `__init__.py` | Package initialization |
---
## Technical Details
| Parameter | Value |
|-----------|-------|
| Qubits | 4 |
| Variational Layers | 3 |
| Trainable Parameters | ~2,000 (quantum) + ~200,000 (classical) |
| Framework | PennyLane + PyTorch |
| Fallback | Classical approximation if PennyLane unavailable |
---
## Limitations
- **Not perfect**: confidence estimation is inherently uncertain
- **Training-data dependent**: quality depends on the training examples
- **Simulation only**: currently runs on a quantum simulator, not real hardware
- **Latency**: adds roughly 10-50 ms per inference for quantum circuit execution
---
## Citation
```bibtex
@misc{qgpt2026,
  title={Q-GPT: Quantum-Enhanced Confidence Estimation for Language Models},
  author={squ11z1},
  year={2026},
  url={https://huggingface.co/squ11z1/Q-GPT}
}
```
---
## Acknowledgments
- [PennyLane](https://pennylane.ai/): quantum ML framework
- [GPT-OSS](https://huggingface.co/squ11z1/gpt-oss-9b-reasoning): base model
---
**Pro Mundi Vita**