---
language: en
license: mit
tags:
- qec
- surface-code
- quantum
- pytorch
- quantum-error-correction
- neural-decoder
- bivariate-bicycle
- ldpc
- transfer-learning
- foundation-decoder
pipeline_tag: other
---
# NTU Neural Decoder Checkpoints
Pre-trained model weights for the neural decoders introduced in **Neural
Transfer Unification (NTU)**, an architecture-agnostic transfer-learning
framework for scalable quantum error correction.
📄 **Paper**: *Efficient Foundation Decoders for Fault-Tolerant Quantum Computing*
🌐 **Project page**: [https://grahamyan.github.io/ntu-decoder/](https://grahamyan.github.io/ntu-decoder/)
💻 **Source code**: [https://github.com/GrahamYan/ntu-decoder](https://github.com/GrahamYan/ntu-decoder)
---
## Overview
NTU exploits the algebraic scale invariance of structured QEC code families to
transfer error knowledge from small codes to large-scale fault-tolerant regimes,
eliminating the cold-start optimization barrier. The framework is instantiated
with two backbone architectures:
| Backbone | Description | Code families |
|---|---|---|
| **NTU-Transformer** | Interleaved RNN-Transformer with 2D RoPE and cross-attention readout | Surface, BB |
| **NTU-Neural-BP** | Graph-neural belief propagation on the code Tanner graph | BB |
For planar surface codes under circuit-level depolarizing noise, NTU-Transformer
surpasses standard PyMatching at *d* = 25 within a ~10³ GPU-hour training budget.
For the [[72, 12, 6]] bivariate-bicycle (BB) code, it outperforms BP+OSD across
all tested physical error rates and is competitive with multi-stage Relay BP.
Transfer from [[72, 12, 6]] to [[144, 12, 12]] reaches 93.1% block accuracy
within 2,500 steps (NTU-Transformer) and 95.3% within 500 steps (NTU-Neural-BP).
---
## Repository Structure
```
ntu-surface-code-decoder/
├── README.md
├── surface/                     ← Surface code (NTU-Transformer)
│   ├── d7.pth                   (121 MB, trained from scratch)
│   ├── d11.pth                  (121 MB, transferred from d=7)
│   ├── d15.pth                  (121 MB, transferred from d=11)
│   ├── d19.pth                  (121 MB, transferred from d=15)
│   ├── d23.pth                  (121 MB, transferred from d=19)
│   └── d25.pth                  (122 MB, transferred from d=23)
└── bb/                          ← BB code checkpoints
    ├── bb72_transformer.pt      (138 MB, NTU-Transformer, [[72,12,6]])
    └── neural_bp_bb72.pt        (1.2 MB, NTU-Neural-BP, [[72,12,6]])
```
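The surface-code checkpoints form a transfer chain, each one initialized from the next smaller distance. A minimal sketch of that chain, assuming only the filenames and distances shown in the tree above (the helper names `checkpoint_filename` and `transfer_lineage` are hypothetical and not part of the official code):

```python
# Distances for which a surface-code checkpoint is listed in this repository.
SURFACE_DISTANCES = [7, 11, 15, 19, 23, 25]


def checkpoint_filename(d):
    """Return the in-repo path of the checkpoint for code distance d."""
    if d not in SURFACE_DISTANCES:
        raise ValueError(f"no checkpoint listed for d={d}")
    return f"surface/d{d}.pth"


def transfer_lineage(d):
    """Return the distances trained on the way to d, smallest first."""
    if d not in SURFACE_DISTANCES:
        raise ValueError(f"no checkpoint listed for d={d}")
    return SURFACE_DISTANCES[: SURFACE_DISTANCES.index(d) + 1]


print(checkpoint_filename(19))  # surface/d19.pth
print(transfer_lineage(19))     # [7, 11, 15, 19]
```

The returned path can be passed as the `filename` argument of `hf_hub_download`.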
Each surface code checkpoint contains `model_state` (OrderedDict of weights),
`d` (code distance), `rounds` (syndrome extraction rounds), and `step`
(training step). BB Transformer checkpoints additionally include `block_acc`
and `output_convention` metadata. NTU-Neural-BP checkpoints store the raw
`state_dict` directly.
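Because the two layouts differ, loading code has to branch on whether the weights are nested under `model_state`. The sketch below shows one way to do that; `extract_state_dict` is a hypothetical helper, and the stand-in dictionaries use placeholder values rather than real tensors or real metadata so that it runs without downloading anything:

```python
from collections import OrderedDict


def extract_state_dict(ckpt):
    """Return (state_dict, metadata) for either checkpoint layout."""
    if "model_state" in ckpt:
        # Transformer layout: weights nested next to metadata fields.
        metadata = {k: v for k, v in ckpt.items() if k != "model_state"}
        return ckpt["model_state"], metadata
    # Neural-BP layout: the object itself is the state_dict.
    return ckpt, {}


# Placeholder checkpoints (values are illustrative, not taken from the files).
wrapped = {
    "model_state": OrderedDict([("stem.weight", "tensor placeholder")]),
    "d": 7,
    "rounds": 7,
    "step": 1000,
}
raw = OrderedDict([("gru.weight_ih", "tensor placeholder")])

weights, meta = extract_state_dict(wrapped)
print(sorted(meta))                # ['d', 'rounds', 'step']
print(extract_state_dict(raw)[1])  # {}
```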
---
## Usage
### NTU-Transformer: Surface code
```python
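import torch
from huggingface_hub import hf_hub_download

# Download the d=7 checkpoint from the Hub (cached locally after the first call).
ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="surface/d7.pth",
)
# weights_only=False runs the full unpickler; only use it on files you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# `model` must already be an NTU-Transformer instance built with the official
# source code; this snippet does not construct it.
# Strip the prefixes added by torch.compile ("_orig_mod.") and
# DataParallel/DistributedDataParallel ("module.") before loading.
model.load_state_dict(
    {k.replace("_orig_mod.", "").replace("module.", ""): v
     for k, v in ckpt["model_state"].items()},
    strict=False,
)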
```
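The snippet above removes `_orig_mod.` and `module.` wherever they occur in a key, which would also alter a parameter name that merely contains one of those strings. Whether the NTU checkpoints contain such names is not stated here, so the following is only a stricter alternative that removes leading wrapper prefixes; `strip_wrapper_prefixes` is a hypothetical helper and the keys are made up for illustration:

```python
def strip_wrapper_prefixes(state_dict, prefixes=("_orig_mod.", "module.")):
    """Remove wrapper prefixes from the start of each key, in any order."""
    cleaned = {}
    for key, value in state_dict.items():
        stripped = True
        while stripped:
            stripped = False
            for prefix in prefixes:
                if key.startswith(prefix):
                    key = key[len(prefix):]
                    stripped = True
        cleaned[key] = value
    return cleaned


# Illustrative keys only; integers stand in for tensors.
example = {
    "_orig_mod.module.encoder.weight": 1,
    "module.head.bias": 2,
    "decoder.submodule.weight": 3,
}
print(list(strip_wrapper_prefixes(example)))
# ['encoder.weight', 'head.bias', 'decoder.submodule.weight']

# Plain str.replace would have mangled the third key:
print("decoder.submodule.weight".replace("module.", ""))  # decoder.subweight
```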
### NTU-Transformer: BB code
```python
import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/bb72_transformer.pt",
)
ckpt = torch.load(ckpt_path, map_location="cpu")
# Strip torch.compile / DataParallel prefixes from the parameter names.
state_dict = {k.replace("_orig_mod.", "").replace("module.", ""): v
              for k, v in ckpt["model_state"].items()}

# `model` must already be an NTU-Transformer instance for the BB code.
# Keep only keys that exist in the target model with matching shapes,
# and skip logical_readout_bias.
model_sd = model.state_dict()
filtered = {k: v for k, v in state_dict.items()
            if k in model_sd and model_sd[k].shape == v.shape
            and k != "logical_readout_bias"}
model.load_state_dict(filtered, strict=False)
```
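The filter above silently drops any tensor that is missing from the model or has a different shape, which can hide a mismatched configuration. The sketch below performs the same filtering but also records why each key was dropped. `partition_by_compatibility` is a hypothetical helper, and `SimpleNamespace` objects stand in for tensors since only their `.shape` attribute is read:

```python
from types import SimpleNamespace


def partition_by_compatibility(ckpt_sd, model_sd, skip=("logical_readout_bias",)):
    """Split checkpoint entries into loadable ones and skipped ones with a reason."""
    loadable, skipped = {}, {}
    for key, value in ckpt_sd.items():
        if key in skip:
            skipped[key] = "explicitly skipped"
        elif key not in model_sd:
            skipped[key] = "not in model"
        elif tuple(model_sd[key].shape) != tuple(value.shape):
            skipped[key] = "shape mismatch"
        else:
            loadable[key] = value
    return loadable, skipped


def t(*shape):
    """Stand-in for a tensor: an object exposing only a .shape attribute."""
    return SimpleNamespace(shape=shape)


# Illustrative parameter names, not the real NTU-Transformer keys.
ckpt_sd = {"attn.weight": t(8, 8), "readout.weight": t(12, 8),
           "old.bias": t(8), "logical_readout_bias": t(12)}
model_sd = {"attn.weight": t(8, 8), "readout.weight": t(24, 8),
            "logical_readout_bias": t(12)}

loadable, skipped = partition_by_compatibility(ckpt_sd, model_sd)
print(list(loadable))  # ['attn.weight']
print(skipped)
```

With real tensors, `loadable` can be passed to `model.load_state_dict(loadable, strict=False)`, whose return value lists the missing and unexpected keys.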
### NTU-Neural-BP: BB code
```python
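import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/neural_bp_bb72.pt",
)
# The file is a raw state_dict, so the safer weights_only=True loader suffices.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
# `model` must already be an NTU-Neural-BP instance for the [[72,12,6]] code.
state_dict = {k.replace("module.", ""): v for k, v in ckpt.items()}
model.load_state_dict(state_dict, strict=True)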
```
### Inference with the official code
```bash
git clone https://github.com/GrahamYan/ntu-decoder.git
cd ntu-decoder
# Surface code.
bash inference.sh --code surface --d 7 \
--hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000
# BB code β€” NTU-Transformer.
bash inference.sh --code bb --model transformer --block_size 72 \
--hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
# BB code β€” NTU-Neural-BP.
bash inference.sh --code bb --model neural_bp --block_size 72 \
--hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
```
---
## Model Architecture
### NTU-Transformer
The Transformer-based decoder combines standard multi-head self-attention
blocks with the following components:
- **Scalable STEM embedding**: Encodes syndrome data from variable-size
lattices into a shared latent representation, absorbing the distance
dependence into the input encoding (Eq. 2 in the paper).
- **QEC-aware 2D RoPE**: Applies rotary position embeddings to relative
algebraic displacements defined by the code's shift set *M*(*x*, *y*, *t*; *d*),
preserving detector geometry across code distances.
- **Interleaved RNN-Transformer backbone**: 5 GRU-based recurrent blocks
alternating with 6 spatial self-attention blocks.
- **Cross-attention logical readout**: Learnable logical query tokens attend
over the encoded detector representations.
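
The property that makes rotary embeddings suitable for transfer across distances is that attention scores depend only on relative displacement, not on absolute position. The sketch below demonstrates this with a generic 2D RoPE in plain Python. It is not the paper's QEC-aware variant: the shift set *M* is not modeled, and the frequency base and the half/half split of the feature vector between the two axes are assumptions.

```python
import math


def rotate_pairs(vec, position, base=10000.0):
    """Apply 1D rotary embedding to an even-length vector at a scalar position."""
    n_pairs = len(vec) // 2
    out = []
    for i in range(n_pairs):
        theta = position * base ** (-i / n_pairs)
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[2 * i], vec[2 * i + 1]
        out.extend([a * c - b * s, a * s + b * c])
    return out


def rope_2d(vec, x, y):
    """Rotate the first half of vec by the x coordinate and the second half by y."""
    half = len(vec) // 2
    return rotate_pairs(vec[:half], x) + rotate_pairs(vec[half:], y)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Arbitrary query/key vectors (length must be divisible by 4).
q = [0.3, -1.2, 0.7, 0.5, -0.4, 0.9, 1.1, -0.6]
k = [1.0, 0.2, -0.8, 0.4, 0.6, -0.3, 0.5, 0.7]

# Two query/key placements with the same displacement (1, 2).
score_a = dot(rope_2d(q, 2, 5), rope_2d(k, 1, 3))
score_b = dot(rope_2d(q, 9, 4), rope_2d(k, 8, 2))
print(abs(score_a - score_b) < 1e-9)  # True
```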
### NTU-Neural-BP
A graph-neural-network decoder operating on the bipartite Tanner graph:
- **Message passing** between variable and check nodes with gated recurrent
units (GRU) for message updates.
- **Syndrome-aware encoding** of check node states and prior LLRs.
- **Focal loss** with syndrome consistency regularization.
- Compact model (~300K parameters for the [[72, 12, 6]] code).
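
The two training ingredients named above can be illustrated on a toy example. The sketch below uses the standard binary focal loss and a hard syndrome check, H·e mod 2, on a 3-bit repetition code. The focusing parameter `gamma=2.0`, the toy parity-check matrix, and the use of a hard check rather than a differentiable penalty are all assumptions; the exact loss used for NTU-Neural-BP is defined in the paper and source code.

```python
import math


def focal_loss(p, target, gamma=2.0):
    """Binary focal loss for a predicted probability p (0 < p < 1) that the bit is 1."""
    p_t = p if target == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def syndrome(H, error):
    """Syndrome s = H e (mod 2) for a parity-check matrix given as rows of 0/1."""
    return [sum(h * e for h, e in zip(row, error)) % 2 for row in H]


# Toy parity checks of a 3-bit repetition code (not a BB code).
H = [[1, 1, 0],
     [0, 1, 1]]
observed = syndrome(H, [0, 1, 0])

# A candidate correction is syndrome-consistent if it reproduces `observed`.
print(syndrome(H, [0, 1, 0]) == observed)  # True
print(syndrome(H, [1, 0, 0]) == observed)  # False

# Confident correct predictions are down-weighted relative to uncertain ones.
print(focal_loss(0.9, 1) < focal_loss(0.6, 1))  # True
```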
---
## Authors
[Ge Yan](https://grahamyan.github.io)<sup>1</sup>,
Shanchuan Li<sup>1, 2</sup>,
Shiyi Xiao<sup>1, 3</sup>,
Pengyue Ma<sup>1</sup>,
Hanyan Cao<sup>4</sup>,
[Feng Pan](https://scholar.google.com/citations?user=Vp6hFhUAAAAJ)<sup>4,\*</sup>,
[Yuxuan Du](https://yuxuan-du.github.io)<sup>1,\*</sup>
<sup>1</sup> College of Computing and Data Science, Nanyang Technological University, Singapore<br>
<sup>2</sup> Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Japan<br>
<sup>3</sup> School of Artificial Intelligence, Shanghai Jiao Tong University, China<br>
<sup>4</sup> Science, Mathematics and Technology Cluster, Singapore University of Technology and Design, Singapore
<small><sup>\*</sup> Corresponding authors</small>
---
## Citation
```bibtex
@article{ntu2026,
title={Efficient Foundation Decoders for Fault-Tolerant Quantum Computing},
author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
Cao, Hanyan and Pan, Feng and Du, Yuxuan},
year={2026},
}
```
---
## License
This repository is released under the [MIT License](https://opensource.org/licenses/MIT).