# 🌌 QSBench: Complete User Guide

Welcome to the **QSBench Analytics Hub**. This platform is designed to bridge the gap between quantum circuit topology and machine learning, allowing researchers to study how structural characteristics influence quantum simulation outcomes.

---

## ⚠️ Important: Demo Dataset Notice

The datasets currently loaded in this hub are **v1.0.0-demo versions**.

- **Scale**: These are small *shards* (subsets) of the full QSBench library.
- **Accuracy**: Because the training data is limited in size, ML models trained here will show lower accuracy and higher variance than models trained on the full-scale production datasets.
- **Purpose**: These sets are intended for **demonstration and prototyping** of analytical pipelines before moving to high-performance computing (HPC) environments.

---

## 📂 1. Dataset Architecture & Selection

QSBench provides high-fidelity simulation data for the Quantum Machine Learning (QML) community. We provide four distinct environments to test how different noise models affect the data:

### Core (Clean)
Ideal state-vector simulations. Used as a **"Golden Reference"** to understand the theoretical limits of a circuit's expressivity without physical interference.

### Depolarizing Noise
Simulates the effect of qubits losing their state toward the maximally mixed state. This is the standard **"white noise"** of quantum computing.

### Amplitude Damping
Represents **T1 relaxation (energy loss)**. This is an asymmetric noise model in which qubits decay from |1⟩ to |0⟩; it is critical for studying superconducting hardware.

### Transpilation (10q)
Circuits are mapped to a **hardware topology (heavy-hex or grid)**. Used to study how SWAP gates and routing overhead affect final results.

---

## 📊 2. Feature Engineering: Structural Metrics

Why do we extract these specific features? In QML, the **structure ("shape") of a circuit directly impacts performance**.

- **gate_entropy** Measures the distribution of gate types.
  High entropy → complex, less repetitive circuits → harder for classical models to learn.
- **meyer_wallach** Quantifies **global entanglement**. Entanglement provides quantum advantage but increases sensitivity to noise.
- **adjacency** Represents the density of the qubit interaction graph. High adjacency → faster information spread, but a higher risk of cross-talk errors.
- **cx_count (Two-Qubit Gates)** The most critical complexity metric. On NISQ devices, CNOT gates are **10x–100x noisier** than single-qubit gates.

**Note on Feature Correlation:** While structural metrics (like `gate_entropy` or `depth`) describe the complexity of the circuit, they do not encode the specific rotation angles of individual gates. Predicting the exact expectation value from structural features alone is therefore an **extremely challenging task** (a non-trivial mapping).

---

## 🎯 3. Multi-Target Regression (The Bloch Vector)

Unlike traditional benchmarks that focus on a single observable, QSBench targets the **full global Bloch vector**:

[⟨X⟩_global, ⟨Y⟩_global, ⟨Z⟩_global]

```text
        +Z (0)
         |
         |
    -----|---- +Y
        /|
       / |
     +X  |
        -Z (1)
```

---

### Why predict all three?

A quantum state is a point on (or inside) the **Bloch sphere**.

- Predicting only Z gives an incomplete picture.
- Multi-target regression learns correlations between:
  - circuit structure
  - full quantum state orientation
  - behavior in Hilbert space

---

## 🤖 4. Using the ML Analytics Module

The Hub uses a **Random Forest Regressor** to establish a baseline of predictability.

### Workflow

1. **Select Dataset**
   Choose a noise model and observe how it affects predictability.
2. **Select Features**
   Recommended starting set:
   - `gate_entropy`
   - `meyer_wallach`
   - `depth`
   - `cx_count`
3. **Execute Baseline**
   Performs an **80/20 train-test split**.
4.
   **Analyze the Triple Parity Plot**
   - 🔴 **Diagonal red line** → perfect prediction
   - 📈 **Clustering near the line** → strong predictive signal
   - 🔍 **Basis comparison**:
     - Z is often easier to predict
     - X/Y depend more on circuit structure
     - reveals architectural biases (HEA vs. QFT, etc.)

📉 **How to Interpret "Bad" Metrics**

If you see a **negative** R² or predictions clustering around zero, don't panic. This is the expected behavior for standard regression on quantum data:

- **Mean-Predictor Baseline:** In complex circuits (n=8, depth=6), expectation values naturally concentrate around 0. A model that simply predicts "0" for everything will have a low MAE but a zero or negative R².
- **The Complexity Gap:** A negative R² indicates that the relationship between circuit shape and quantum output is highly non-linear.
- **Research Challenge:** Use these baseline results to justify the need for more advanced architectures, such as **Graph Neural Networks (GNNs)** or **Recursive Quantum Filters**, that can process the gate sequence itself.

---

## 🔗 5. Project Resources

- 🤗 Hugging Face Datasets — download dataset shards
- 💻 GitHub Repository — QSBench generator source code
- 🌐 Official Website — documentation and benchmarking leaderboards

---

*QSBench — Synthetic Quantum Dataset Benchmarks*
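---

The structural metrics of Section 2 can be illustrated with a short sketch. A common reading of `gate_entropy` is the Shannon entropy of the circuit's gate-type histogram, and `cx_count` is simply the number of CX (CNOT) gates; the exact definitions used by the QSBench generator may differ, so treat this as an illustration rather than the reference implementation:

```python
import math
from collections import Counter

def gate_entropy(gates):
    """Shannon entropy (in bits) of the gate-type distribution.

    One plausible reading of the `gate_entropy` feature; the exact
    definition used by the QSBench generator may differ.
    """
    counts = Counter(gates)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def cx_count(gates):
    """Number of two-qubit CX (CNOT) gates in the circuit."""
    return sum(1 for g in gates if g == "cx")

circuit = ["h", "cx", "rz", "cx", "h", "rz", "cx", "x"]
print(gate_entropy(circuit))  # higher => more varied, less repetitive gate mix
print(cx_count(circuit))      # 3
```

A circuit that repeats a single gate type has entropy 0, while a uniform mix over k gate types has entropy log2(k), which matches the intuition that high entropy means a less repetitive circuit.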
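---

The Bloch-vector targets of Section 3 have a simple single-qubit analogue. QSBench regresses *global* averages over all qubits, but the one-qubit case conveys the geometry; a minimal NumPy sketch:

```python
import numpy as np

# Pauli operators
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_vector(psi):
    """Return [<X>, <Y>, <Z>] for a normalized single-qubit state vector."""
    return [float(np.real(np.conj(psi) @ (P @ psi))) for P in (X, Y, Z)]

ket0 = np.array([1, 0], dtype=complex)               # |0>
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)  # |+>

print(bloch_vector(ket0))  # [0.0, 0.0, 1.0] -> the +Z pole
print(bloch_vector(plus))  # ~[1, 0, 0]      -> the +X axis
```

Predicting only ⟨Z⟩ would make |+⟩ and |−⟩ indistinguishable (both have ⟨Z⟩ = 0), which is why all three components are regressed jointly.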
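---

The "bad metrics" discussion in Section 4 can be made concrete: R² is defined relative to the mean predictor, so a model that predicts the target mean scores exactly 0, and an uninformative model that guesses around the mean scores below 0 even when its MAE looks small. A self-contained sketch (synthetic values standing in for the concentrated expectation values; no QSBench data involved):

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
# Synthetic stand-in: expectation values concentrated near 0, as in deep circuits
y = rng.normal(loc=0.0, scale=0.05, size=500)

mean_pred = np.full_like(y, y.mean())
print(np.mean(np.abs(y - mean_pred)))  # small MAE...
print(r2_score(y, mean_pred))          # ...but R^2 is exactly 0.0

noise_pred = y.mean() + rng.normal(scale=0.05, size=500)
print(r2_score(y, noise_pred))         # negative: worse than the mean baseline
```

A negative R² therefore says the model extracted no usable signal beyond the target mean, not that the pipeline is broken.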