# 🌌 QSBench: Entanglement Score Regression Guide
Welcome to the **QSBench Regression Hub**.
This tool demonstrates how Machine Learning can predict the **degree of quantum entanglement** — measured by the **Meyer–Wallach score** — using only circuit structure and topology.
---
## ⚠️ Important: Demo Dataset Notice
This Space uses **demo shards** of the QSBench datasets.
- **Limited size:** The dataset is intentionally reduced.
- **Impact:** Model performance may be unstable or noisy.
- **Goal:** Showcase how structural features correlate with entanglement — not achieve production-level accuracy.
---
## 🧠 1. What is Being Predicted?
The model predicts:
### `meyer_wallach`
A continuous entanglement measure:
- **0.0 → No entanglement**
- **1.0 → Maximum entanglement**
This metric captures how strongly qubits are globally correlated in a circuit.
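For a pure state, the Meyer–Wallach score is `Q = 2 · (1 − mean single-qubit purity)`. A minimal numpy sketch (for illustration only — this is not the code the Space runs, which never simulates states):

```python
import numpy as np

def meyer_wallach(state: np.ndarray, n_qubits: int) -> float:
    """Q = 2 * (1 - average purity of the single-qubit reduced states)."""
    psi = state.reshape([2] * n_qubits)
    purities = []
    for k in range(n_qubits):
        # Move qubit k to the front; rows of m index qubit k, columns the rest.
        m = np.moveaxis(psi, k, 0).reshape(2, -1)
        rho_k = m @ m.conj().T                      # reduced density matrix of qubit k
        purities.append(np.real(np.trace(rho_k @ rho_k)))
    return 2.0 * (1.0 - float(np.mean(purities)))

# Bell state (|00> + |11>)/sqrt(2): maximally entangled
print(meyer_wallach(np.array([1, 0, 0, 1]) / np.sqrt(2), 2))  # → 1.0 (approximately)
# Product state |00>: no entanglement
print(meyer_wallach(np.array([1.0, 0, 0, 0]), 2))             # → 0.0 (approximately)
```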
---
## 🧩 2. How the Model “Sees” a Circuit
The model does **not simulate quantum states**.
Instead, it uses **structural proxies**:
### 🔹 Topology Features
- `adj_density` — how densely qubits interact
- `adj_degree_mean` — average connectivity
- `adj_degree_std` — variability in connectivity
→ These reflect **entanglement potential** in the circuit graph.
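These three features can be read straight off the qubit-interaction graph. A sketch, assuming a symmetric 0/1 adjacency matrix (the exact construction in the dataset may differ):

```python
import numpy as np

def topology_features(adj: np.ndarray) -> dict:
    """Structural features of a qubit-interaction graph (symmetric 0/1 adjacency)."""
    n = adj.shape[0]
    degrees = adj.sum(axis=1)
    max_edges = n * (n - 1) / 2
    return {
        "adj_density": adj.sum() / 2 / max_edges,  # fraction of possible edges present
        "adj_degree_mean": float(degrees.mean()),
        "adj_degree_std": float(degrees.std()),
    }

# Linear 4-qubit chain 0-1-2-3: 3 of 6 possible edges, degrees [1, 2, 2, 1]
chain = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
print(topology_features(chain))
# {'adj_density': 0.5, 'adj_degree_mean': 1.5, 'adj_degree_std': 0.5}
```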
---
### 🔹 Gate Structure
- `total_gates`
- `single_qubit_gates`
- `two_qubit_gates`
- `cx_count`
→ Two-qubit gates are the **primary drivers of entanglement**.
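Counting these features is straightforward once a circuit is reduced to a gate list. A hypothetical sketch (the `(name, qubits)` tuple format is an assumption, not the dataset's actual representation):

```python
from collections import Counter

def gate_features(ops) -> dict:
    """ops: list of (gate_name, qubit_indices) tuples from a parsed circuit."""
    counts = Counter(name for name, _ in ops)
    return {
        "total_gates": len(ops),
        "single_qubit_gates": sum(1 for _, q in ops if len(q) == 1),
        "two_qubit_gates": sum(1 for _, q in ops if len(q) == 2),
        "cx_count": counts["cx"],
    }

ops = [("h", (0,)), ("cx", (0, 1)), ("rz", (1,)), ("cx", (1, 2))]
print(gate_features(ops))
# {'total_gates': 4, 'single_qubit_gates': 2, 'two_qubit_gates': 2, 'cx_count': 2}
```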
---
### 🔹 Complexity Metrics
- `depth`
- `gate_entropy`
→ Capture how “deep” and “structured” the circuit is.
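A plausible reading of `gate_entropy` is the Shannon entropy of the gate-type distribution — high when many gate types are used evenly, zero for a single gate type. A sketch under that assumption:

```python
import math
from collections import Counter

def gate_entropy(gate_names) -> float:
    """Shannon entropy (in bits) of the gate-type distribution."""
    counts = Counter(gate_names)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(gate_entropy(["h", "cx", "h", "cx"]))  # → 1.0 (two types, used evenly)
print(gate_entropy(["cx", "cx", "cx"]))      # a single gate type → zero entropy
```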
---
### 🔹 QASM-derived Signals
- `qasm_length`
- `qasm_line_count`
- `qasm_gate_keyword_count`
→ Lightweight proxies for circuit complexity without parsing semantics.
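These signals need nothing more than string operations. A sketch (the keyword list and token-matching rule are illustrative assumptions):

```python
def qasm_features(qasm: str, keywords=("h", "x", "cx", "rz", "ry", "rx")) -> dict:
    """Cheap string-level proxies; no semantic parsing of the QASM."""
    lines = qasm.strip().splitlines()
    tokens = qasm.split()
    return {
        "qasm_length": len(qasm),
        "qasm_line_count": len(lines),
        # Strip any '(theta)' suffix before matching against gate keywords.
        "qasm_gate_keyword_count": sum(t.split("(")[0] in keywords for t in tokens),
    }

qasm = """OPENQASM 2.0;
qreg q[2];
h q[0];
cx q[0],q[1];"""
print(qasm_features(qasm))  # 4 lines, 2 gate keywords ('h' and 'cx')
```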
---
## 🤖 3. Model Overview
The system uses:
### Random Forest Regressor
- Works well on tabular data
- Handles non-linear relationships
- Provides feature importance
Pipeline includes:
- Missing value imputation
- Feature scaling
- Ensemble regression
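The described pipeline maps directly onto scikit-learn. A runnable sketch on synthetic data (feature values, target construction, and hyperparameters are placeholders, not the Space's actual configuration; scaling is not required by random forests but is kept to mirror the pipeline as described):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 200 circuits x 5 structural features -> score in [0, 1]
rng = np.random.default_rng(0)
X_clean = rng.random((200, 5))
y = np.clip(0.8 * X_clean[:, 0] + 0.2 * rng.random(200), 0.0, 1.0)
X = X_clean.copy()
X[rng.random(X.shape) < 0.05] = np.nan  # simulate missing values

model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
])
model.fit(X, y)
preds = model.predict(X)
print(preds.shape)  # (200,)
```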
---
## 📊 4. Understanding the Results
After clicking **"Train & Evaluate"**, you get:
---
### A. Actual vs Predicted
- Each point = one circuit
- Diagonal line = perfect prediction
→ The closer the points lie to the diagonal, the better the fit
---
### B. Residual Distribution
- Shows prediction errors
- Errors centered around 0 → unbiased model
→ A wide spread = high uncertainty or weak features
---
### C. Feature Importance
Top contributing features to prediction.
Typical patterns:
- `cx_count` → strong signal
- `adj_density` → topology influence
- `depth` → complexity contribution
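Feature importances come for free from the fitted forest. A sketch on synthetic data where `cx_count` dominates the target by construction (the feature names and coefficients are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
names = ["cx_count", "adj_density", "depth", "gate_entropy"]
X = rng.random((300, 4))
y = 0.7 * X[:, 0] + 0.2 * X[:, 1] + 0.1 * X[:, 2]  # cx_count dominates by design

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
ranking = sorted(zip(names, rf.feature_importances_), key=lambda p: -p[1])
for name, imp in ranking:
    print(f"{name:14s} {imp:.3f}")  # cx_count should rank first here
```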
---
## 📉 5. Metrics Explained
- **RMSE** — penalizes large errors
- **MAE** — average absolute error
- **R²** — goodness of fit (1 = perfect; can go negative for models worse than predicting the mean)
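All three metrics are available in `sklearn.metrics`. A self-contained example on toy values:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([0.0, 0.5, 1.0, 0.25])
y_pred = np.array([0.1, 0.5, 0.9, 0.25])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # sqrt penalizes large errors
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(f"RMSE={rmse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
# RMSE=0.0707  MAE=0.0500  R2=0.9634
```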
---
## 🧪 6. Experimentation Tips
Try:
- Removing `cx_count` → see how performance drops
- Using only topology → isolate structural effect
- Increasing trees → more stable predictions
- Changing test split → robustness check
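The first ablation can be sketched in a few lines: train once with all features, once with the `cx_count` column dropped, and compare held-out R². Synthetic data stands in for the demo shards, so only the mechanics (not the numbers) carry over:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((400, 3))                # columns: cx_count, adj_density, depth
y = 0.8 * X[:, 0] + 0.2 * X[:, 1]      # synthetic target dominated by cx_count

def holdout_r2(X, y, seed=0):
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=seed)
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(Xtr, ytr)
    return r2_score(yte, rf.predict(Xte))

full = holdout_r2(X, y)
without_cx = holdout_r2(np.delete(X, 0, axis=1), y)  # drop the cx_count column
print(f"R2 with cx_count:    {full:.3f}")
print(f"R2 without cx_count: {without_cx:.3f}")
```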
---
## 🔬 7. Key Insight
> Entanglement is not random — it is encoded in circuit structure.
Even without simulation:
- Gate distribution
- Connectivity
- Depth
…already contain enough signal to estimate entanglement.
---
## 🔗 8. Project Resources
- 🤗 Hugging Face: https://huggingface.co/QSBench
- 💻 GitHub: https://github.com/QSBench
- 🌐 Website: https://qsbench.github.io