language: en
license: mit
tags:
- qec
- surface-code
- quantum
- pytorch
- quantum-error-correction
- neural-decoder
- bivariate-bicycle
- ldpc
- transfer-learning
- foundation-decoder
pipeline_tag: other
NTU Neural Decoder Checkpoints
Pre-trained model weights for the neural decoders introduced in Neural Transfer Unification (NTU), an architecture-agnostic transfer-learning framework for scalable quantum error correction.
Paper: Efficient Foundation Decoders for Fault-Tolerant Quantum Computing
Project page: https://grahamyan.github.io/ntu-decoder/
Source code: https://github.com/GrahamYan/ntu-decoder
Overview
NTU exploits the algebraic scale invariance of structured QEC code families to transfer error knowledge from small codes to large-scale fault-tolerant regimes, eliminating the cold-start optimization barrier. The framework is instantiated with two backbone architectures:
| Backbone | Description | Code families |
|---|---|---|
| NTU-Transformer | Interleaved RNN-Transformer with 2D RoPE and cross-attention readout | Surface, BB |
| NTU-Neural-BP | Graph-neural belief propagation on the code Tanner graph | BB |
For planar surface codes under circuit-level depolarizing noise, NTU-Transformer surpasses standard PyMatching at d = 25 within a ~10³ GPU-hour training budget. For the [[72, 12, 6]] bivariate-bicycle (BB) code, it outperforms BP+OSD across all tested physical error rates and is competitive with multi-stage Relay BP. Transfer from [[72, 12, 6]] to [[144, 12, 12]] reaches 93.1% block accuracy within 2,500 steps (NTU-Transformer) and 95.3% within 500 steps (NTU-Neural-BP).
Repository Structure
ntu-surface-code-decoder/
├── README.md
├── surface/                      # Surface code (NTU-Transformer)
│   ├── d7.pth                    (121 MB, trained from scratch)
│   ├── d11.pth                   (121 MB, transferred from d=7)
│   ├── d15.pth                   (121 MB, transferred from d=11)
│   ├── d19.pth                   (121 MB, transferred from d=15)
│   ├── d23.pth                   (121 MB, transferred from d=19)
│   └── d25.pth                   (122 MB, transferred from d=23)
└── bb/                           # BB code checkpoints
    ├── bb72_transformer.pt       (138 MB, NTU-Transformer, [[72,12,6]])
    └── neural_bp_bb72.pt         (1.2 MB, NTU-Neural-BP, [[72,12,6]])
Each surface code checkpoint contains model_state (OrderedDict of weights),
d (code distance), rounds (syndrome extraction rounds), and step
(training step). BB Transformer checkpoints additionally include block_acc
and output_convention metadata. NTU-Neural-BP checkpoints store the raw
state_dict directly.
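Because the two layouts differ, a small helper can normalize them before loading. The sketch below is illustrative only: `split_checkpoint` is a hypothetical name, not part of the official code, and the demo dictionaries hold placeholder values rather than real weights or real training metadata.

```python
from typing import Any, Dict, Mapping, Tuple


def split_checkpoint(ckpt: Mapping[str, Any]) -> Tuple[Mapping[str, Any], Dict[str, Any]]:
    """Return (weights, metadata) for either checkpoint layout.

    Hypothetical helper: Transformer checkpoints wrap the weights in
    "model_state" next to their metadata, while Neural-BP checkpoints
    are the raw state_dict and therefore carry no metadata.
    """
    if "model_state" in ckpt:
        metadata = {k: v for k, v in ckpt.items() if k != "model_state"}
        return ckpt["model_state"], metadata
    return ckpt, {}


# Placeholder dictionaries standing in for the result of torch.load(...).
surface_like = {"model_state": {"layer.weight": [0.0]}, "d": 7, "rounds": 7, "step": 1000}
neural_bp_like = {"module.layer.weight": [0.0]}

weights, meta = split_checkpoint(surface_like)
print(sorted(meta))  # ['d', 'rounds', 'step']

weights, meta = split_checkpoint(neural_bp_like)
print(meta)  # {}
```

With a real file, the input would be the dictionary returned by `torch.load`, and the returned weights would go through the key cleaning shown in the Usage section.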
Usage
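NTU-Transformer: Surface code
import torch
from huggingface_hub import hf_hub_download

# Download the d = 7 surface-code checkpoint from the Hugging Face Hub.
ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="surface/d7.pth",
)
# weights_only=False unpickles arbitrary objects; use it only with files you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# `model` is an NTU-Transformer instance built with the official source code.
# Remove the prefixes added by torch.compile ("_orig_mod.") and DataParallel/DDP ("module.").
model.load_state_dict(
    {k.replace("_orig_mod.", "").replace("module.", ""): v
     for k, v in ckpt["model_state"].items()},
    strict=False,
)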
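The other surface-code checkpoints load the same way; only `filename` changes. As a convenience sketch, the helper below maps a code distance to its file in this repository and to the distance it was transferred from, following the repository listing above. The names `SURFACE_DISTANCES`, `surface_filename`, and `transfer_parent` are illustrative and not part of the official code.

```python
from typing import Optional

# Distances with released checkpoints, in transfer order (d = 7 is trained from scratch).
SURFACE_DISTANCES = [7, 11, 15, 19, 23, 25]


def surface_filename(d: int) -> str:
    """Return the path of the surface-code checkpoint for distance d inside the repo."""
    if d not in SURFACE_DISTANCES:
        raise ValueError(f"no released checkpoint for d = {d}")
    return f"surface/d{d}.pth"


def transfer_parent(d: int) -> Optional[int]:
    """Return the distance the checkpoint was transferred from, or None for d = 7."""
    idx = SURFACE_DISTANCES.index(d)
    return SURFACE_DISTANCES[idx - 1] if idx > 0 else None


print(surface_filename(11))  # surface/d11.pth
print(transfer_parent(11))   # 7
print(transfer_parent(7))    # None
```

The returned string can be passed directly as the `filename` argument of `hf_hub_download`.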
NTU-Transformer: BB code
# Reuses the imports from the surface-code example above.
ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/bb72_transformer.pt",
)
# Note: since PyTorch 2.6, torch.load defaults to weights_only=True.
ckpt = torch.load(ckpt_path, map_location="cpu")
state_dict = {k.replace("_orig_mod.", "").replace("module.", ""): v
              for k, v in ckpt["model_state"].items()}

# Keep only keys that exist in the target model with a matching shape,
# and skip logical_readout_bias.
model_sd = model.state_dict()
filtered = {k: v for k, v in state_dict.items()
            if k in model_sd and model_sd[k].shape == v.shape
            and k != "logical_readout_bias"}
model.load_state_dict(filtered, strict=False)
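Because `strict=False` ignores mismatched entries without raising, it can help to see which keys the filter drops. The sketch below restates the key cleaning and filtering as standalone functions that also return the dropped keys. `clean_keys` and `filter_compatible` are hypothetical names, and the demo uses stand-in objects with a `.shape` attribute so that it runs without PyTorch; real tensors behave the same way in these comparisons.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Mapping, Sequence, Tuple


def clean_keys(state_dict: Mapping[str, Any]) -> Dict[str, Any]:
    """Remove the "_orig_mod." (torch.compile) and "module." (DataParallel/DDP) substrings."""
    return {k.replace("_orig_mod.", "").replace("module.", ""): v
            for k, v in state_dict.items()}


def filter_compatible(
    state_dict: Mapping[str, Any],
    model_sd: Mapping[str, Any],
    skip: Sequence[str] = ("logical_readout_bias",),
) -> Tuple[Dict[str, Any], List[str]]:
    """Split entries into those safe to load and the keys that were dropped."""
    kept: Dict[str, Any] = {}
    dropped: List[str] = []
    for k, v in state_dict.items():
        if k in model_sd and model_sd[k].shape == v.shape and k not in skip:
            kept[k] = v
        else:
            dropped.append(k)
    return kept, dropped


def fake(*shape: int) -> SimpleNamespace:
    """Stand-in for a tensor: only the .shape attribute is needed here."""
    return SimpleNamespace(shape=shape)


# Made-up parameter names and shapes, for illustration only.
ckpt_sd = {
    "_orig_mod.encoder.weight": fake(4, 4),
    "_orig_mod.head.weight": fake(2, 4),
    "logical_readout_bias": fake(12),
}
model_sd = {
    "encoder.weight": fake(4, 4),
    "head.weight": fake(3, 4),
    "logical_readout_bias": fake(12),
}

kept, dropped = filter_compatible(clean_keys(ckpt_sd), model_sd)
print(sorted(kept))     # ['encoder.weight']
print(sorted(dropped))  # ['head.weight', 'logical_readout_bias']
```

With a real model, `model.load_state_dict(kept, strict=False)` additionally returns `missing_keys` and `unexpected_keys`, which gives a second check on what was actually loaded.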
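NTU-Neural-BP: BB code
# Reuses the imports from the surface-code example above.
ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/neural_bp_bb72.pt",
)
# The file is a raw state_dict, so weights_only=True is sufficient.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
state_dict = {k.replace("module.", ""): v for k, v in ckpt.items()}
# `model` is an NTU-Neural-BP instance built with the official source code.
model.load_state_dict(state_dict, strict=True)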
Inference with the official code
git clone https://github.com/GrahamYan/ntu-decoder.git
cd ntu-decoder
# Surface code.
bash inference.sh --code surface --d 7 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000
# BB code: NTU-Transformer.
bash inference.sh --code bb --model transformer --block_size 72 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
# BB code: NTU-Neural-BP.
bash inference.sh --code bb --model neural_bp --block_size 72 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
Model Architecture
NTU-Transformer
The Transformer-based decoder combines standard multi-head self-attention blocks with the following components:
- Scalable STEM embedding: encodes syndrome data from variable-size lattices into a shared latent representation, absorbing the distance dependence into the input encoding (Eq. 2 in the paper).
- QEC-aware 2D RoPE: applies rotary position embeddings to relative algebraic displacements defined by the code's shift set M(x, y, t; d), preserving detector geometry across code distances.
- Interleaved RNN-Transformer backbone: 5 GRU-based recurrent blocks alternating with 6 spatial self-attention blocks.
- Cross-attention logical readout: learnable logical query tokens attend over the encoded detector representations.
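To illustrate why rotary embeddings suit transfer across lattice sizes, the sketch below implements a generic 2D RoPE in plain Python and checks that the attention score between two rotated vectors depends only on their relative displacement. It is not the paper's formulation: it ignores the time coordinate and the shift set M(x, y, t; d), and the function names, the frequency base of 10000, and the even split of the feature dimension between x and y are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def rotate_pairs(vec: Sequence[float], pos: float, base: float = 10000.0) -> List[float]:
    """1D RoPE: rotate consecutive feature pairs by the angle pos * base**(-i / n_pairs)."""
    assert len(vec) % 2 == 0, "feature dimension must be even"
    n_pairs = len(vec) // 2
    out: List[float] = []
    for i in range(n_pairs):
        theta = pos * base ** (-i / n_pairs)
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[2 * i], vec[2 * i + 1]
        out += [a * c - b * s, a * s + b * c]
    return out


def rope_2d(vec: Sequence[float], xy: Tuple[float, float]) -> List[float]:
    """Rotate the first half of the features by x and the second half by y."""
    assert len(vec) % 4 == 0, "feature dimension must be divisible by 4"
    half = len(vec) // 2
    return rotate_pairs(vec[:half], xy[0]) + rotate_pairs(vec[half:], xy[1])


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# Arbitrary query and key vectors with 8 features each.
q = [0.3, -1.2, 0.7, 0.5, 1.1, -0.4, 0.2, 0.9]
k = [1.0, 0.1, -0.6, 0.8, -0.3, 0.5, 0.4, -1.5]

# Shifting both positions by the same offset (3, -2) leaves the score unchanged.
score = dot(rope_2d(q, (2, 5)), rope_2d(k, (4, 1)))
shifted = dot(rope_2d(q, (2 + 3, 5 - 2)), rope_2d(k, (4 + 3, 1 - 2)))
print(abs(score - shifted) < 1e-9)  # True
```

Because the score is a function of displacement only, the same attention weights apply wherever a detector pair sits in the lattice, which is the property that makes position handling independent of the lattice size in this toy setting.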
NTU-Neural-BP
A graph-neural-network decoder operating on the bipartite Tanner graph:
- Message passing between variable and check nodes with gated recurrent units (GRU) for message updates.
- Syndrome-aware encoding of check node states and prior LLRs.
- Focal loss with syndrome consistency regularization.
- Compact model (~300K parameters for the [[72, 12, 6]] code).
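For orientation, the sketch below runs one round of classical min-sum belief propagation for syndrome decoding on a toy Tanner graph. It shows a hand-designed message-passing scheme on the same kind of graph, not NTU-Neural-BP itself: the learned GRU updates, the syndrome-aware encoding, and the focal loss are not modeled. The 3-bit repetition-code parity-check matrix, the error rate p = 0.1, and all function names are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]  # (check index, variable index)


def min_sum_round(
    H: Sequence[Sequence[int]],
    syndrome: Sequence[int],
    prior_llr: Sequence[float],
    v2c: Dict[Edge, float],
) -> Tuple[Dict[Edge, float], List[float]]:
    """One check-to-variable min-sum update followed by posterior LLRs.

    Convention: a positive LLR means "no error" on that variable.
    Assumes every check touches at least two variables.
    """
    c2v: Dict[Edge, float] = {}
    for c, row in enumerate(H):
        nbrs = [v for v, bit in enumerate(row) if bit]
        for v in nbrs:
            incoming = [v2c[(c, u)] for u in nbrs if u != v]
            # A violated check (syndrome bit 1) flips the sign of the outgoing message.
            sign = -1.0 if syndrome[c] else 1.0
            for m in incoming:
                if m < 0:
                    sign = -sign
            c2v[(c, v)] = sign * min(abs(m) for m in incoming)
    posterior = list(prior_llr)
    for (c, v), m in c2v.items():
        posterior[v] += m
    return c2v, posterior


# Toy example: 3-bit repetition code, both checks violated.
H = [[1, 1, 0],
     [0, 1, 1]]
syndrome = [1, 1]
p = 0.1
prior = [math.log((1 - p) / p)] * 3

# In the first round, variable-to-check messages equal the priors.
v2c = {(c, v): prior[v] for c, row in enumerate(H) for v, bit in enumerate(row) if bit}
_, posterior = min_sum_round(H, syndrome, prior, v2c)

estimate = [1 if llr < 0 else 0 for llr in posterior]
print(estimate)  # [0, 1, 0]

# Syndrome consistency: the estimated error reproduces the measured syndrome.
residual = [sum(h * e for h, e in zip(row, estimate)) % 2 for row in H]
print(residual == syndrome)  # True
```

The final check mirrors the idea behind a syndrome-consistency term: a proposed error pattern is only plausible if it reproduces the observed syndrome.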
Authors
Ge Yan¹, Shanchuan Li¹,², Shiyi Xiao¹,³, Pengyue Ma¹, Hanyan Cao⁴, Feng Pan⁴,*, Yuxuan Du¹,*
¹ College of Computing and Data Science, Nanyang Technological University, Singapore
² Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Japan
³ School of Artificial Intelligence, Shanghai Jiao Tong University, China
⁴ Science, Mathematics and Technology Cluster, Singapore University of Technology and Design, Singapore
* Corresponding authors
Citation
@article{ntu2026,
  title={Efficient Foundation Decoders for Fault-Tolerant Quantum Computing},
  author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
          Cao, Hanyan and Pan, Feng and Du, Yuxuan},
  year={2026},
}
License
This repository is released under the MIT License.