# 🌌 QSBench: Complete User Guide
Welcome to the **QSBench Analytics Hub**.
This platform is designed to bridge the gap between quantum circuit topology and machine learning, allowing researchers to study how structural characteristics influence quantum simulation outcomes.
---
## ⚠️ Important: Demo Dataset Notice
The datasets currently loaded in this hub are **v1.0.0-demo versions**.
- **Scale**: These are small *shards* (subsets) of the full QSBench library.
- **Accuracy**: Because the training data is limited in size, ML models trained here will show lower accuracy and higher variance compared to models trained on full-scale production datasets.
- **Purpose**: These sets are intended for **demonstration and prototyping** of analytical pipelines before moving to high-performance computing (HPC) environments.
---
## 📂 1. Dataset Architecture & Selection
QSBench provides high-fidelity simulation data for the Quantum Machine Learning (QML) community.
We provide four distinct environments to test how different noise models affect data:
### Core (Clean)
Ideal state-vector simulations.
Used as a **"Golden Reference"** to understand the theoretical limits of a circuit's expressivity without physical interference.
### Depolarizing Noise
Simulates the effect of qubits losing their state toward a maximally mixed state.
This is the standard **"white noise"** of quantum computing.
### Amplitude Damping
Represents **T1 relaxation (energy loss)**.
This is an asymmetric noise model where qubits decay from |1⟩ to |0⟩, critical for studying superconducting hardware.
### Transpilation (10q)
Circuits are mapped to a **hardware topology (heavy-hex or grid)**.
Used to study how SWAP gates and routing overhead affect final results.
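The depolarizing and amplitude-damping channels above can be illustrated as single-qubit density-matrix maps. The following is a minimal NumPy sketch of the textbook channel definitions, not the QSBench simulator itself:

```python
import numpy as np

def depolarize(rho: np.ndarray, p: float) -> np.ndarray:
    """Single-qubit depolarizing channel: mix the state toward I/2."""
    return (1 - p) * rho + p * np.eye(2) / 2

def amplitude_damp(rho: np.ndarray, gamma: float) -> np.ndarray:
    """Single-qubit amplitude damping (T1 decay) via Kraus operators."""
    k0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    k1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return k0 @ rho @ k0.conj().T + k1 @ rho @ k1.conj().T

# |1><1| decays asymmetrically toward |0><0| under amplitude damping,
# while depolarizing drags any state toward the maximally mixed I/2.
one = np.array([[0, 0], [0, 1]], dtype=complex)
print(np.round(amplitude_damp(one, 0.3), 3))  # diag(0.3, 0.7)
print(np.round(depolarize(one, 0.3), 3))      # diag(0.15, 0.85)
```

Note the asymmetry: amplitude damping moves population only toward |0⟩, whereas depolarizing is basis-agnostic.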
---
## 📊 2. Feature Engineering: Structural Metrics
Why do we extract these specific features?
In QML, the **structure ("shape") of a circuit directly impacts performance**.
- **gate_entropy**
Measures the distribution of gate types across the circuit.
High entropy → complex, less repetitive circuits → harder for classical models to learn.
- **meyer_wallach**
Quantifies **global entanglement**.
Entanglement provides quantum advantage but increases sensitivity to noise.
- **adjacency**
Represents qubit interaction graph density.
High adjacency → faster information spread, but higher risk of cross-talk errors.
- **cx_count (Two-Qubit Gates)**
The most critical complexity metric.
On NISQ devices, CNOT gates are **10x–100x noisier** than single-qubit gates.
**Note on Feature Correlation:** While structural metrics (like `gate_entropy` or `depth`) describe the complexity of the circuit, they do not encode the specific rotation angles of individual gates.
Therefore, predicting the exact expectation value from structural features alone is an **extremely challenging, non-trivial mapping**.
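As a rough illustration of how structural metrics like these can be computed, here is a toy extractor over a plain gate list. The gate representation and function below are assumptions for illustration, not the QSBench extraction pipeline:

```python
import math
from collections import Counter

def circuit_features(gates):
    """Toy structural-feature extraction from a list of (name, qubits) tuples,
    e.g. ("cx", (0, 1)). Mirrors the metric definitions in the guide."""
    names = [name for name, _ in gates]
    total = len(names)
    # gate_entropy: Shannon entropy of the gate-type distribution
    gate_entropy = -sum(
        (c / total) * math.log2(c / total) for c in Counter(names).values()
    )
    # cx_count: number of two-qubit gates
    cx_count = sum(1 for _, qs in gates if len(qs) == 2)
    # adjacency: density of the qubit-interaction graph
    qubits = {q for _, qs in gates for q in qs}
    n = len(qubits)
    edges = {tuple(sorted(qs)) for _, qs in gates if len(qs) == 2}
    adjacency = len(edges) / (n * (n - 1) / 2) if n > 1 else 0.0
    return {"gate_entropy": gate_entropy, "cx_count": cx_count, "adjacency": adjacency}

demo = [("h", (0,)), ("cx", (0, 1)), ("rz", (1,)), ("cx", (1, 2))]
print(circuit_features(demo))  # entropy 1.5, 2 CX gates, adjacency 2/3
```

Note what is *not* captured: the rotation angle of `rz` never enters the features, which is exactly why the structure-to-expectation mapping is so hard.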
---
## 🎯 3. Multi-Target Regression (The Bloch Vector)
Unlike traditional benchmarks that focus on a single observable, QSBench targets the **full global Bloch vector**:
`[⟨X⟩_global, ⟨Y⟩_global, ⟨Z⟩_global]`
```text
          +Z (0)
           |
           |
   --------+-------- +Y
          /|
         / |
       +X  |
           -Z (1)
```
---
### Why predict all three?
A quantum state is a point on (or inside) the **Bloch sphere**.
- Predicting only Z gives an incomplete picture
- Multi-target regression learns correlations between:
- circuit structure
- full quantum state orientation
- behavior in Hilbert space
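Assuming the "global" targets are qubit-averaged single-qubit expectations (an assumption; the guide does not spell out the aggregation), a statevector version can be sketched in NumPy:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

def global_bloch(state: np.ndarray, n: int):
    """Average the single-qubit ⟨X⟩, ⟨Y⟩, ⟨Z⟩ over all n qubits of a statevector."""
    out = []
    for pauli in (X, Y, Z):
        vals = []
        for q in range(n):
            full = np.array([[1.0 + 0j]])
            for k in range(n):
                full = np.kron(full, pauli if k == q else I)
            vals.append((state.conj() @ full @ state).real)
        out.append(float(np.mean(vals)))
    return out

# |+> ⊗ |0>: qubit 0 contributes ⟨X⟩ = 1, qubit 1 contributes ⟨Z⟩ = 1
state = np.kron(np.array([1, 1]) / np.sqrt(2), np.array([1, 0])).astype(complex)
print(global_bloch(state, 2))  # [0.5, 0.0, 0.5]
```

The example shows why a single-axis target is lossy: the Z component alone (0.5) says nothing about the X-polarized qubit.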
---
## 🤖 4. Using the ML Analytics Module
The Hub uses a **Random Forest Regressor** to establish a baseline of predictability.
### Workflow
1. **Select Dataset**
Choose a noise model and observe how it affects predictability.
2. **Select Features**
Recommended starting set:
- `gate_entropy`
- `meyer_wallach`
- `depth`
- `cx_count`
3. **Execute Baseline**
Performs an **80/20 train-test split**.
4. **Analyze the Triple Parity Plot**
- 🔴 **Diagonal Red Line** → perfect prediction
- 📈 **Clustering near line** → strong predictive signal
- 🔍 **Basis comparison**:
- Z often easier to predict
- X/Y depend more on circuit structure
- reveals architectural biases (HEA vs QFT, etc.)
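The workflow above can be sketched with scikit-learn. The synthetic arrays below merely stand in for a real shard; the feature columns mirror the recommended starting set, and `RandomForestRegressor` handles the three Bloch targets natively:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
# Stand-in data: 4 structural features (gate_entropy, meyer_wallach,
# depth, cx_count) and 3 targets (⟨X⟩, ⟨Y⟩, ⟨Z⟩)
X = rng.normal(size=(300, 4))
y = np.tanh(0.5 * X[:, :3]) + rng.normal(scale=0.3, size=(300, 3))

# Step 3: 80/20 train-test split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_tr, y_tr)                    # multi-output regression, no wrapper needed
y_pred = model.predict(X_te)
print("per-target R²:", r2_score(y_te, y_pred, multioutput="raw_values"))
```

Comparing the three per-target R² values is the numeric counterpart of the triple parity plot: a basis that clusters near the red line scores higher.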
📉 **How to Interpret "Bad" Metrics**
If you see a **negative** R² or predictions clustering around zero, don't panic. This is expected behavior for standard regression on quantum data:
- **Mean-Predictor Baseline:** In complex circuits (n=8, depth=6), expectation values naturally concentrate around 0. A model that simply predicts "0" for everything will have a low MAE but a zero or negative R².
- **The Complexity Gap:** A negative R² indicates that the relationship between circuit shape and quantum output is highly non-linear.
- **Research Challenge:** Use these baseline results to justify the need for more advanced architectures like **Graph Neural Networks (GNNs)** or **Recursive Quantum Filters** that can process the gate sequence itself.
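The mean-predictor point is easy to verify directly: predicting the sample mean for every target yields R² = 0 by construction, even though the MAE looks deceptively small when values concentrate near zero:

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error

rng = np.random.default_rng(1)
# Targets concentrated near 0, as expectation values are for deep circuits
y_true = rng.normal(loc=0.0, scale=0.1, size=200)

# "Always predict the mean" baseline
mean_pred = np.full_like(y_true, y_true.mean())

print("MAE:", mean_absolute_error(y_true, mean_pred))  # small, looks good
print("R²:", r2_score(y_true, mean_pred))              # exactly 0 by definition
```

Any model whose test predictions fit worse than this constant baseline lands below zero, which is why negative R² is a signal about problem hardness, not a broken pipeline.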
---
## 🔗 5. Project Resources
- 🤗 Hugging Face Datasets – download dataset shards
- 💻 GitHub Repository – QSBench generator source code
- 🌐 Official Website – documentation and benchmarking leaderboards
---
*QSBench – Synthetic Quantum Dataset Benchmarks*