# 🌌 QSBench: Entanglement Score Regression Guide

Welcome to the **QSBench Regression Hub**. This tool demonstrates how machine learning can predict the **degree of quantum entanglement**, measured by the **Meyer–Wallach score**, using only circuit structure and topology.

---

## ⚠️ Important: Demo Dataset Notice

This Space uses **demo shards** of the QSBench datasets.

- **Limited size:** The dataset is intentionally reduced.
- **Impact:** Model performance may be unstable or noisy.
- **Goal:** Showcase how structural features correlate with entanglement, not achieve production-level accuracy.

---

## 🧠 1. What Is Being Predicted?

The model predicts:

### `meyer_wallach`

A continuous entanglement measure:

- **0.0 → no entanglement**
- **1.0 → maximum entanglement**

This metric captures how strongly qubits are globally correlated in a circuit.

---

## 🧩 2. How the Model "Sees" a Circuit

The model does **not simulate quantum states**. Instead, it uses **structural proxies**:

### 🔹 Topology Features

- `adj_density` – how densely qubits interact
- `adj_degree_mean` – average connectivity
- `adj_degree_std` – variability in connectivity

→ These reflect the **entanglement potential** encoded in the circuit's interaction graph.

---

### 🔹 Gate Structure

- `total_gates`
- `single_qubit_gates`
- `two_qubit_gates`
- `cx_count`

→ Two-qubit gates are the **primary drivers of entanglement**.

---

### 🔹 Complexity Metrics

- `depth`
- `gate_entropy`

→ These capture how "deep" and "structured" the circuit is.

---

### 🔹 QASM-derived Signals

- `qasm_length`
- `qasm_line_count`
- `qasm_gate_keyword_count`

→ Lightweight proxies for circuit complexity that require no semantic parsing.

---

## 🤖 3. Model Overview

The system uses a:

### Random Forest Regressor

- Works well on tabular data
- Handles non-linear relationships
- Provides feature importances

The pipeline includes:

- Missing-value imputation
- Feature scaling
- Ensemble regression

---
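The stages listed above can be sketched as a scikit-learn pipeline. This is a minimal illustration, not the demo's actual code: the synthetic features and target stand in for the real QSBench columns.

```python
# Sketch of an impute -> scale -> random forest pipeline (illustrative data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((200, 5))                        # stand-in for structural features
y = 0.7 * X[:, 0] + rng.normal(0, 0.05, 200)   # stand-in entanglement target

model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # missing-value imputation
    ("scale", StandardScaler()),                    # feature scaling
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model.fit(X_tr, y_tr)
print("Held-out R^2:", model.score(X_te, y_te))
```

Note that tree ensembles do not strictly need feature scaling; the scaler is kept here only to mirror the pipeline stages listed above, where it is harmless.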
## 📊 4. Understanding the Results

After clicking **"Train & Evaluate"**, you get:

---

### A. Actual vs. Predicted

- Each point = one circuit
- Diagonal line = perfect prediction

→ The closer the points sit to the line, the better the model.

---

### B. Residual Distribution

- Shows prediction errors
- Centered around 0 → good model

→ A wide spread indicates uncertainty or weak features.

---

### C. Feature Importance

The top features contributing to the prediction. Typical patterns:

- `cx_count` → strong signal
- `adj_density` → topology influence
- `depth` → complexity contribution

---

## 📉 5. Metrics Explained

- **RMSE** – root mean squared error; penalizes large errors
- **MAE** – mean absolute error; the average error magnitude
- **R²** – goodness of fit (1 = perfect)

---

## 🧪 6. Experimentation Tips

Try:

- Removing `cx_count` → see how much performance drops
- Using only topology features → isolate the structural effect
- Increasing the number of trees → more stable predictions
- Changing the test split → robustness check

---

## 🔬 7. Key Insight

> Entanglement is not random; it is encoded in circuit structure.

Even without simulation:

- Gate distribution
- Connectivity
- Depth

…already contain enough signal to estimate entanglement.

---

## 🔗 8. Project Resources

- 🤗 Hugging Face: https://huggingface.co/QSBench
- 💻 GitHub: https://github.com/QSBench
- 🌐 Website: https://qsbench.github.io
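---

## 🧮 Appendix: Computing `meyer_wallach` Directly

For reference, the target the model learns to predict has a closed form: for an n-qubit pure state, the Meyer–Wallach score is Q = (2/n) Σₖ (1 − Tr ρₖ²), where ρₖ is the reduced density matrix of qubit k. The model itself never simulates states, but when a statevector is available the score can be computed with plain NumPy, as in this sketch:

```python
# Meyer-Wallach score of a pure n-qubit statevector (NumPy sketch).
import numpy as np

def meyer_wallach(state: np.ndarray) -> float:
    n = int(np.log2(state.size))
    total = 0.0
    for k in range(n):
        # Separate qubit k's axis, then flatten the remaining qubits.
        psi = np.moveaxis(state.reshape([2] * n), k, 0).reshape(2, -1)
        rho = psi @ psi.conj().T                      # reduced density matrix of qubit k
        total += 1.0 - np.real(np.trace(rho @ rho))   # 1 - purity
    return 2.0 * total / n

product = np.array([1, 0, 0, 0], dtype=complex)            # |00>, separable
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)
print(meyer_wallach(product))  # ≈ 0.0 (no entanglement)
print(meyer_wallach(bell))     # ≈ 1.0 (maximum entanglement)
```

These two states reproduce the endpoints described in section 1: a product state scores 0.0 and a maximally entangled state scores 1.0.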