# 🌌 CNOT Count Regression Guide

Welcome to the **CNOT Count Regression Hub**. This tool demonstrates how machine learning can predict the number of **CNOT (CX) gates** — the most noise-prone two-qubit operations — using only structural features of quantum circuits.

---

## ⚠️ Important: Demo Dataset Notice

The datasets used here are **v1.0.0-demo** shards.

* **Constraint:** Reduced dataset size.
* **Impact:** Model accuracy may fluctuate depending on the split and feature selection.
* **Goal:** Demonstrate how circuit topology correlates with entangling-gate usage.

---

## 🎯 1. What is Being Predicted?

The model predicts:

### `cx_count`

The total number of **CNOT gates** in a circuit.

Why this matters:

* CNOT gates are the **main source of noise** on NISQ devices
* They dominate **error rates and decoherence**
* Reducing them is key to **hardware-efficient quantum algorithms**

---

## 🧠 2. How the Model Works

We train a **Random Forest Regressor** to map circuit features → `cx_count`.

### Input Features (examples):

* **Topology:**
  * `adj_density` — connectivity density
  * `adj_degree_mean` — average qubit interaction degree
* **Complexity:**
  * `depth` — circuit depth
  * `total_gates` — total number of operations
* **Structure:**
  * `gate_entropy` — randomness vs. regularity
* **QASM-derived:**
  * `qasm_length`, `qasm_line_count`
  * `qasm_gate_keyword_count`

The model learns how **structural patterns imply entangling cost**.

---

## 📊 3. Understanding the Output

After training, you'll see:

### A. Actual vs. Predicted Plot

* Each point = one circuit
* Diagonal line = perfect prediction
* Spread = prediction error

👉 Tight clustering = good model

---

### B. Residual Distribution

* Shows prediction errors (`actual - predicted`)
* Centered around 0 = unbiased model
* Wide spread = unstable predictions

---

### C. Feature Importance

Top features driving predictions:

* High importance = strong influence on `cx_count`
* Helps identify:
  * what increases entanglement cost
  * which metrics matter most

---

## 🔍 4. Explorer Tab
Inspect real circuits:

* View dataset slices (`train`, etc.)
* See raw and transpiled QASM
* Understand how circuits differ structurally

---

## ⚙️ 5. Tips for Better Results

* Use **diverse features** (topology + QASM)
* Avoid datasets that become too small after filtering
* Tune:
  * `max_depth`
  * `n_estimators`
* Try different datasets:
  * Noise changes structure → changes predictions

---

## 🚀 6. Why This Matters

This tool helps answer:

* How expensive is a circuit in terms of **entangling operations**?
* Can we estimate noise **before execution**?
* Which designs are more **hardware-friendly**?

---

## 🔗 7. Project Resources

* 🤗 [Hugging Face](https://huggingface.co/QSBench)
* 💻 [GitHub](https://github.com/QSBench)
* 🌐 [Website](https://qsbench.github.io)
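The training and diagnostics workflow described in Sections 2 and 3 can be sketched in a few lines of scikit-learn. The feature names below come from this guide, but the data is a synthetic stand-in (the real v1.0.0-demo shards and their loading code are not shown), so treat this as an illustration of the workflow, not the actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in features (names follow the guide's feature list).
rng = np.random.default_rng(0)
n = 500
features = {
    "adj_density": rng.uniform(0.0, 1.0, n),        # connectivity density
    "depth": rng.integers(1, 50, n).astype(float),  # circuit depth
    "total_gates": rng.integers(5, 200, n).astype(float),
    "gate_entropy": rng.uniform(0.0, 3.0, n),
}
X = np.column_stack(list(features.values()))
# Toy target: entangling cost loosely tied to circuit size and connectivity.
y = 0.3 * X[:, 2] * X[:, 0] + rng.normal(0.0, 2.0, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, max_depth=10, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
residuals = y_test - pred  # "actual - predicted", as in the residual plot
print("MAE:", mean_absolute_error(y_test, pred))
# Feature importances mirror the "Feature Importance" output.
for name, importance in zip(features, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

Swapping the synthetic arrays for real per-circuit features reproduces the actual-vs-predicted, residual, and feature-importance outputs described in Section 3.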
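The tuning tip in Section 5 can be made concrete with a small grid search over the two hyperparameters the guide names (`max_depth`, `n_estimators`). The data here is again synthetic, so this is a sketch of the procedure rather than a tuned configuration for the demo shards.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; replace with real circuit features and cx_count.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (300, 4))
y = 50.0 * X[:, 0] * X[:, 1] + rng.normal(0.0, 1.0, 300)

# Search the two knobs the guide suggests tuning.
grid = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"max_depth": [5, 10, None], "n_estimators": [50, 100]},
    cv=3,
    scoring="neg_mean_absolute_error",
)
grid.fit(X, y)
print("best params:", grid.best_params_)
print("best CV score (neg MAE):", grid.best_score_)
```

Cross-validated scoring matters here because, with the reduced demo shards, a single train/test split can make one configuration look better purely by chance.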