@@ -1,99 +1,230 @@
-# Q-GPT: Quantum-Enhanced GPT
-A quantum neural network head that adds confidence estimation to GPT models.
-## Features
-- 🔮 **Variational Quantum Circuit** - Uses PennyLane for true quantum computing simulation
-- 📊 **Confidence Estimation** - Estimates how confident the model is in its response
-- 🚫 **Refusal Detection** - Identifies when the model should refuse to answer
-- ⚡ **Classical Fallback** - Works without PennyLane using classical approximation
-## Installation
 ```bash
 pip install pennylane torch transformers
 ```
-## Usage
 ### Quick Start
 ```python
 from quantum_head import load_qgpt
-# Load Q-GPT
-model, tokenizer = load_qgpt(
-    "squ11z1/gpt-oss-9b-reasoning",
-    torch_dtype="auto",
-    device="auto",
-)
 # Generate with confidence
-inputs = tokenizer("What is 2 + 2?", return_tensors="pt").to(model.device)
-outputs = model.generate_with_confidence(inputs.input_ids, max_new_tokens=50)
 print(f"Response: {tokenizer.decode(outputs['sequences'][0])}")
-print(f"Confidence: {outputs['confidence_label']}")  # e.g., "high"
-print(f"Should refuse: {outputs['should_refuse']}")
 ```
-### Just the Quantum Head
 ```python
 from quantum_head import QuantumHead
 import torch
-# Create quantum head
-head = QuantumHead(hidden_size=2880)  # Match your model's hidden size
-# Forward pass with hidden states
-hidden_states = torch.randn(1, 2880)  # From your model
-output = head(hidden_states)
-print(f"Confidence: {output['confidence'].item():.2f}")
-print(f"Uncertainty: {output['uncertainty'].item():.2f}")
 ```
-### Training
-```bash
-# Create synthetic training data
-python train.py --model squ11z1/gpt-oss-9b-reasoning --create-data --data train.jsonl
-# Train quantum head
-python train.py --model squ11z1/gpt-oss-9b-reasoning --data train.jsonl --epochs 3
-```
-## Architecture
 ```
-Hidden States → [Classical Compression] → [Quantum Circuit] → [Post-Processing] → Confidence
-     ↓                    ↓                      ↓                    ↓
- [B, H]            [B, n_qubits]          [B, n_qubits]           [B, 2]
-                                                                     ↓
-                                                     confidence + uncertainty
 ```
-### Quantum Circuit
-```
-|0⟩ ─ RY(x₀) ─ RZ(x₀) ─ Rot(θ) ─ ●─────── Rot(θ) ─ ... ─ ⟨Z⟩
-                                 │
-|0⟩ ─ RY(x₁) ─ RZ(x₁) ─ Rot(θ) ─ ⊕ ─ ●─── Rot(θ) ─ ... ─ ⟨Z⟩
-                                     │
-|0⟩ ─ RY(x₂) ─ RZ(x₂) ─ Rot(θ) ───── ⊕ ─ ●─ Rot(θ) ─ ... ─ ⟨Z⟩
-                                         │
-|0⟩ ─ RY(x₃) ─ RZ(x₃) ─ Rot(θ) ───────── ⊕ ─ Rot(θ) ─ ... ─ ⟨Z⟩
-```
-## Files
-- `quantum_head.py` - Main implementation (QuantumHead, QGPT, load_qgpt)
-- `train.py` - Training script for quantum head
-- `quantum_head.pt` - Pre-trained weights (after training)
-## Citation
 ```bibtex
 @misc{qgpt2026,
@@ -104,6 +235,17 @@ Hidden States → [Classical Compression] → [Quantum Circuit] → [Post-Proces
 }
 ```
-## License
-Apache 2.0

+---
+license: apache-2.0
+language:
+  - en
+library_name: transformers
+tags:
+  - quantum
+  - confidence-estimation
+  - uncertainty
+  - pennylane
+  - gpt-oss
+  - hallucination-detection
+pipeline_tag: text-classification
+---
+<div align="center">
+# 🔮 Q-GPT
+### Quantum-Enhanced Confidence Estimation for Language Models
+[![PennyLane](https://img.shields.io/badge/PennyLane-Quantum_ML-6C3483?style=for-the-badge)](https://pennylane.ai/)
+[![PyTorch](https://img.shields.io/badge/PyTorch-Compatible-EE4C2C?style=for-the-badge&logo=pytorch)](https://pytorch.org/)
+[![License](https://img.shields.io/badge/License-Apache_2.0-green?style=for-the-badge)](https://www.apache.org/licenses/LICENSE-2.0)
+**Know when your LLM is confident — and when it's guessing.**
+</div>
+---
+## 🎯 What is Q-GPT?
+Q-GPT is a **quantum neural network head** that attaches to any language model and estimates how confident the model is in its response. It helps you detect when the model might be "hallucinating" or making up information.
+### The Problem
+Large Language Models (LLMs) always produce fluent text — even when they don't know the answer. They sound confident even when they're wrong. This makes it hard to trust their outputs in critical applications.
+### The Solution
+Q-GPT analyzes the model's internal hidden states with a **variational quantum circuit**: a small parameterized circuit whose entangling gates are intended to capture correlations and uncertainty in those states that a comparably small classical head might miss. The result is a confidence score that indicates how far to trust the response.
+---
+## 🧠 How It Works
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        Q-GPT Architecture                        │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                   │
+│  LLM Hidden States                Quantum Circuit                 │
+│  [2880 dimensions]                [4 qubits]                      │
+│         │                              │                          │
+│         ▼                              │                          │
+│  ┌─────────────┐                       │                          │
+│  │  Compress   │  ──────────────────►  │                          │
+│  │  to 4 dims  │                       │                          │
+│  └─────────────┘                       ▼                          │
+│                               ┌─────────────────┐                 │
+│                               │   RY   RZ       │                 │
+│                               │   │    │        │  Layer 1        │
+│                               │   Rot ─●─ CNOT  │                 │
+│                               ├─────────────────┤                 │
+│                               │   Rot ─●─ CNOT  │  Layer 2        │
+│                               ├─────────────────┤                 │
+│                               │   Rot ─●─ CNOT  │  Layer 3        │
+│                               └─────────────────┘                 │
+│                                        │                          │
+│                                        ▼                          │
+│                               ┌─────────────────┐                 │
+│                               │  Measure ⟨Z⟩    │                 │
+│                               │  on each qubit  │                 │
+│                               └─────────────────┘                 │
+│                                        │                          │
+│                                        ▼                          │
+│                               ┌─────────────────┐                 │
+│                               │   Confidence    │                 │
+│                               │   0.0 — 1.0     │                 │
+│                               └─────────────────┘                 │
+│                                                                   │
+└─────────────────────────────────────────────────────────────────┘
+```
+### Step by Step:
+1. **Extract Hidden States** — When the LLM generates a response, we capture its internal representation (hidden states from the last layer).
+2. **Compress** — The high-dimensional hidden states (2880 dimensions for GPT-OSS) are compressed to 4 values using a small neural network.
+3. **Quantum Encoding** — These 4 values are encoded into quantum states using rotation gates (RY, RZ). Each value controls the angle of rotation for one qubit.
+4. **Variational Layers** — The qubits pass through multiple layers of:
+   - **Rotation gates** (trainable parameters that learn patterns)
+   - **CNOT gates** (create entanglement between qubits)
+5. **Measurement** — We measure the expectation value ⟨Z⟩ of each qubit, giving us 4 numbers between -1 and +1.
+6. **Confidence Output** — A final layer converts these measurements into a confidence score (0-1) and an uncertainty estimate.
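Steps 3 to 5 can be sketched with a small NumPy statevector simulation. This is an illustration only, not the repository's PennyLane code: the function names are hypothetical, the CNOT chain (0→1, 1→2, 2→3) is an assumed layout, and `rot` follows PennyLane's `Rot(φ, θ, ω) = RZ(ω)·RY(θ)·RZ(φ)` convention.

```python
import numpy as np

N_QUBITS = 4
N_LAYERS = 3

def ry(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(a):
    return np.array([[np.exp(-1j * a / 2), 0], [0, np.exp(1j * a / 2)]], dtype=complex)

def rot(phi, theta, omega):
    # General single-qubit rotation: RZ(omega) @ RY(theta) @ RZ(phi)
    return rz(omega) @ ry(theta) @ rz(phi)

def apply_1q(state, gate, q):
    # Contract the 2x2 gate with axis q of the (2, 2, 2, 2) state tensor.
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)

def apply_cnot(state, control, target):
    # Flip the target axis inside the control == 1 slice.
    state = state.copy()
    idx = [slice(None)] * N_QUBITS
    idx[control] = 1
    t_axis = target if target < control else target - 1
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t_axis).copy()
    return state

def expval_z(state, q):
    # <Z> = P(qubit q is 0) - P(qubit q is 1)
    probs = np.abs(state) ** 2
    other_axes = tuple(a for a in range(N_QUBITS) if a != q)
    p0, p1 = probs.sum(axis=other_axes)
    return float(p0 - p1)

def circuit(x, weights):
    """x: 4 compressed features; weights: array of shape (N_LAYERS, N_QUBITS, 3)."""
    state = np.zeros((2,) * N_QUBITS, dtype=complex)
    state[(0,) * N_QUBITS] = 1.0                      # start in |0000>
    for q in range(N_QUBITS):                         # step 3: angle encoding
        state = apply_1q(state, ry(x[q]), q)
        state = apply_1q(state, rz(x[q]), q)
    for layer in range(N_LAYERS):                     # step 4: variational layers
        for q in range(N_QUBITS):
            state = apply_1q(state, rot(*weights[layer, q]), q)
        for q in range(N_QUBITS - 1):
            state = apply_cnot(state, q, q + 1)
    return np.array([expval_z(state, q) for q in range(N_QUBITS)])  # step 5

x = np.array([0.1, 0.5, -0.3, 1.2])
weights = np.random.default_rng(0).uniform(-np.pi, np.pi, size=(N_LAYERS, N_QUBITS, 3))
print(circuit(x, weights))  # four values, each in [-1, 1]
```

With this layout the circuit has 4 × 3 × 3 = 36 rotation angles; the actual implementation may arrange or count its parameters differently.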
+### Why Quantum?
+- **Entanglement** lets the circuit model correlations between the compressed features with very few trainable gates
+- **Superposition** means each layer acts on a weighted combination of all basis states at once
+- **Inherent probabilistic nature**: measurement outcomes are probabilities, which map naturally onto uncertainty
+- **Compact representation**: 4 qubits span a 16-dimensional (2⁴) state space
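For reference, the state-space size and the measured quantity can be written out as follows. This is standard notation rather than anything specific to this repository; $b_k(i)$ denotes the $k$-th bit of the basis index $i$.

```latex
|\psi\rangle = \sum_{i=0}^{15} c_i\,|i\rangle,
\qquad \sum_{i=0}^{15} |c_i|^2 = 1,
\qquad 2^4 = 16 \text{ complex amplitudes}

\langle Z_k \rangle = \sum_{i=0}^{15} (-1)^{b_k(i)}\,|c_i|^2 \;\in\; [-1, 1]
```

Each ⟨Z⟩ value is a difference of two probabilities, which is why the four measurements lie between -1 and +1.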
+---
+## 📊 What You Get
+| Output | Description |
+|--------|-------------|
+| `confidence` | Score from 0.0 to 1.0 — how sure the model is |
+| `uncertainty` | Quantum-derived uncertainty measure |
+| `should_refuse` | Boolean — True if confidence < 0.3 (model should decline to answer) |
+| `confidence_label` | Human-readable: "very high", "high", "moderate", "low", "very low" |
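The mapping from a raw score to the other fields might look like the sketch below. Only the 0.3 refusal threshold comes from the table above; the label cut-offs in `LABEL_BANDS` and the `summarize` helper are hypothetical, since the README does not state where one label ends and the next begins.

```python
REFUSE_THRESHOLD = 0.3  # stated in the table above

# Hypothetical cut-offs (lower bound, label); the real boundaries may differ.
LABEL_BANDS = [
    (0.9, "very high"),
    (0.7, "high"),
    (0.5, "moderate"),
    (0.3, "low"),
]

def summarize(confidence: float) -> dict:
    """Turn a confidence score in [0, 1] into a label and a refusal flag."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    label = "very low"  # default when no band matches
    for lower_bound, name in LABEL_BANDS:
        if confidence >= lower_bound:
            label = name
            break
    return {
        "confidence": confidence,
        "confidence_label": label,
        "should_refuse": confidence < REFUSE_THRESHOLD,
    }

print(summarize(0.82))
```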
+---
+## 💻 Usage
+### Installation
 ```bash
 pip install pennylane torch transformers
 ```
 ### Quick Start
 ```python
 from quantum_head import load_qgpt
+# Load model with quantum head
+model, tokenizer = load_qgpt("squ11z1/gpt-oss-9b-reasoning")
+# Prepare input
+prompt = "What is the capital of France?"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
 # Generate with confidence
+outputs = model.generate_with_confidence(
+    inputs.input_ids,
+    max_new_tokens=50
+)
+# Check results
 print(f"Response: {tokenizer.decode(outputs['sequences'][0])}")
+print(f"Confidence: {outputs['confidence_label']}")  # e.g. "high"
+print(f"Should refuse: {outputs['should_refuse']}")  # e.g. False
 ```
+### Using Just the Quantum Head
 ```python
 from quantum_head import QuantumHead
 import torch
+# Create quantum head for your model's hidden size
+head = QuantumHead(hidden_size=2880)
+# Get hidden states from your model
+# hidden_states shape: [batch_size, hidden_size]
+hidden_states = torch.randn(1, 2880)
+# Get confidence
+output = head(hidden_states)
+print(f"Confidence: {output['confidence'].item():.2%}")
 ```
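A language model normally returns hidden states of shape `[batch_size, seq_len, hidden_size]`, so they have to be reduced to `[batch_size, hidden_size]` before reaching the head. The README does not say how Q-GPT does this; the sketch below assumes last-token pooling with right-padded inputs, and `last_token_states` is a hypothetical helper. NumPy is used so the example runs without downloading a model; the same indexing works on PyTorch tensors.

```python
import numpy as np

def last_token_states(hidden, attention_mask):
    """Pick the hidden state of the last real token in each sequence.

    hidden:         [batch_size, seq_len, hidden_size]
    attention_mask: [batch_size, seq_len], 1 for real tokens, 0 for padding
                    (assumes padding is on the right)
    """
    hidden = np.asarray(hidden)
    mask = np.asarray(attention_mask).astype(int)
    last_idx = mask.sum(axis=1) - 1          # index of the last real token
    if (last_idx < 0).any():
        raise ValueError("every sequence needs at least one real token")
    batch_idx = np.arange(hidden.shape[0])
    return hidden[batch_idx, last_idx]       # [batch_size, hidden_size]

hidden = np.random.default_rng(0).normal(size=(2, 5, 2880))
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
print(last_token_states(hidden, mask).shape)
```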
+---
+## 🎓 Training the Quantum Head
+The quantum head can be trained on examples for which you know whether the model's answer was correct:
+```python
+from train import train_quantum_head
+train_quantum_head(
+    model_name="squ11z1/gpt-oss-9b-reasoning",
+    train_data_path="train_data.jsonl",  # {text, confidence, is_correct}
+    epochs=3,
+)
 ```
+Training data format (JSONL):
+```json
+{"text": "What is 2+2? The answer is 4.", "confidence": 0.95, "is_correct": true}
+{"text": "The moon is made of cheese.", "confidence": 0.2, "is_correct": false}
 ```
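Before training, it can help to check that every record matches this format. The following is a minimal validation sketch using only the standard library; `parse_jsonl` is a hypothetical helper and not part of `train.py`, and the requirement that `confidence` lie in [0, 1] is an assumption based on the examples above.

```python
import json

REQUIRED_FIELDS = {"text": str, "confidence": (int, float), "is_correct": bool}

def parse_jsonl(lines):
    """Parse and validate JSONL training records, one JSON object per line."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        for key, expected_type in REQUIRED_FIELDS.items():
            if key not in record:
                raise ValueError(f"line {line_no}: missing field {key!r}")
            if not isinstance(record[key], expected_type):
                raise ValueError(f"line {line_no}: field {key!r} has the wrong type")
        conf = record["confidence"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(conf, bool) or not 0.0 <= conf <= 1.0:
            raise ValueError(f"line {line_no}: confidence must be a number in [0, 1]")
        records.append(record)
    return records

sample = [
    '{"text": "What is 2+2? The answer is 4.", "confidence": 0.95, "is_correct": true}',
    '{"text": "The moon is made of cheese.", "confidence": 0.2, "is_correct": false}',
]
print(len(parse_jsonl(sample)), "valid records")
```

To validate a file, pass the open file object directly: `parse_jsonl(open("train_data.jsonl", encoding="utf-8"))`.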
+---
+## 📁 Files
+| File | Description |
+|------|-------------|
+| `quantum_head.py` | Main implementation (QuantumHead, QGPT, load_qgpt) |
+| `train.py` | Training script for the quantum head |
+| `__init__.py` | Package initialization |
+---
+## 🔬 Technical Details
+| Parameter | Value |
+|-----------|-------|
+| Qubits | 4 |
+| Variational Layers | 3 |
+| Trainable Parameters | ~2,000 (quantum) + ~200,000 (classical) |
+| Framework | PennyLane + PyTorch |
+| Fallback | Classical approximation if PennyLane unavailable |
+---
+## ⚠️ Limitations
+- **Not perfect** — Confidence estimation is inherently uncertain
+- **Training data dependent** — Quality depends on training examples
+- **Simulation** — Currently runs on quantum simulator, not real hardware
+- **Latency** — Adds ~10-50ms per inference (quantum circuit execution)
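The latency figure depends on hardware and simulator settings, so it is worth measuring on your own setup. A minimal timing sketch, assuming the head (or any other callable) can simply be called with its inputs; `median_latency_ms` is a hypothetical helper.

```python
import statistics
import time

def median_latency_ms(fn, *args, warmup=3, repeats=20):
    """Return the median wall-clock time of fn(*args) in milliseconds."""
    for _ in range(warmup):          # warm-up calls are not timed
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Stand-in workload; replace with e.g. median_latency_ms(head, hidden_states).
print(f"{median_latency_ms(sum, range(10_000)):.3f} ms")
```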
+---
+## 📖 Citation
 ```bibtex
 @misc{qgpt2026,
 }
 ```
+---
+## 🙏 Acknowledgments
+- [PennyLane](https://pennylane.ai/) — Quantum ML framework
+- [GPT-OSS](https://huggingface.co/squ11z1/gpt-oss-9b-reasoning) — Base model
+---
+<div align="center">
+**Built with 🔮 Quantum Computing and ❤️**
+</div>