NTU Neural Decoder Checkpoints

Pre-trained model weights for the neural decoders introduced in Neural Transfer Unification (NTU), an architecture-agnostic transfer-learning framework for scalable quantum error correction.

📄 Paper: Efficient Foundation Decoders for Fault-Tolerant Quantum Computing

🌐 Project page: https://grahamyan.github.io/ntu-decoder/

💻 Source code: https://github.com/GrahamYan/ntu-decoder

Overview

NTU exploits the algebraic scale invariance of structured QEC code families to transfer error knowledge from small codes to large-scale fault-tolerant regimes, eliminating the cold-start optimization barrier. The framework is instantiated with two backbone architectures:

Backbone	Description	Code families
NTU-Transformer	Interleaved RNN-Transformer with 2D RoPE and cross-attention readout	Surface, BB
NTU-Neural-BP	Graph-neural belief propagation on the code Tanner graph	BB

For planar surface codes under circuit-level depolarizing noise, NTU-Transformer surpasses standard PyMatching at d = 25 within a ~10³ GPU-hour training budget. For the [[72, 12, 6]] bivariate-bicycle (BB) code, it outperforms BP+OSD across all tested physical error rates and is competitive with multi-stage Relay BP. Transfer from [[72, 12, 6]] to [[144, 12, 12]] reaches 93.1% block accuracy within 2,500 steps (NTU-Transformer) and 95.3% within 500 steps (NTU-Neural-BP).

Repository Structure

ntu-surface-code-decoder/
├── README.md
├── surface/                         ← Surface code (NTU-Transformer)
│   ├── d7.pth                       (121 MB, trained from scratch)
│   ├── d11.pth                      (121 MB, transferred from d=7)
│   ├── d15.pth                      (121 MB, transferred from d=11)
│   ├── d19.pth                      (121 MB, transferred from d=15)
│   ├── d23.pth                      (121 MB, transferred from d=19)
│   └── d25.pth                      (122 MB, transferred from d=23)
└── bb/                              ← BB code checkpoints
    ├── bb72_transformer.pt          (138 MB, NTU-Transformer, [[72,12,6]])
    └── neural_bp_bb72.pt            (1.2 MB, NTU-Neural-BP,  [[72,12,6]])

Each surface code checkpoint contains model_state (OrderedDict of weights), d (code distance), rounds (syndrome extraction rounds), and step (training step). BB Transformer checkpoints additionally include block_acc and output_convention metadata. NTU-Neural-BP checkpoints store the raw state_dict directly.
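The surface-code files follow a regular naming pattern and form a single transfer chain starting at d = 7. A minimal sketch of that layout in code is given here; the helper names `surface_checkpoint_filename` and `transfer_chain` are hypothetical and are not part of the official repository.

```python
# Distances with a published surface-code checkpoint, in transfer order.
# d=7 is trained from scratch; each later checkpoint starts from its predecessor.
SURFACE_DISTANCES = (7, 11, 15, 19, 23, 25)


def surface_checkpoint_filename(d):
    """Return the in-repo path of the surface-code checkpoint for distance d."""
    if d not in SURFACE_DISTANCES:
        raise ValueError(f"no checkpoint for d={d}; available: {SURFACE_DISTANCES}")
    return f"surface/d{d}.pth"


def transfer_chain(d):
    """List the checkpoints on the transfer path from d=7 up to distance d."""
    last = SURFACE_DISTANCES.index(int(surface_checkpoint_filename(d)[9:-4]))
    return [surface_checkpoint_filename(x) for x in SURFACE_DISTANCES[: last + 1]]


if __name__ == "__main__":
    print(surface_checkpoint_filename(25))
    print(transfer_chain(15))
```

The returned path can be passed as the `filename` argument of `hf_hub_download`.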

Usage

NTU-Transformer — Surface code

import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="surface/d7.pth",
)

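# `model` must already exist: build the NTU-Transformer for the matching
# distance with the official code (https://github.com/GrahamYan/ntu-decoder).
# weights_only=False unpickles arbitrary objects, so load trusted files only.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
# Strip the prefixes added by torch.compile ("_orig_mod.") and by
# DataParallel/DistributedDataParallel ("module.") before loading.
model.load_state_dict(
    {k.replace("_orig_mod.", "").replace("module.", ""): v
     for k, v in ckpt["model_state"].items()},
    strict=False,
)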
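The key renaming in the snippet above removes the `_orig_mod.` prefix that `torch.compile` adds and the `module.` prefix that `DataParallel` and `DistributedDataParallel` add. As a sketch, the same step can be factored into a reusable function; the name `strip_wrapper_prefixes` is hypothetical, and the function works on any mapping, so it is shown with placeholder values instead of tensors.

```python
from collections import OrderedDict


def strip_wrapper_prefixes(state_dict, prefixes=("_orig_mod.", "module.")):
    """Return a copy of state_dict with wrapper prefixes removed from every key.

    Mirrors the chained str.replace calls used in the loading snippets:
    each prefix string is removed wherever it occurs in the key.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        for prefix in prefixes:
            key = key.replace(prefix, "")
        cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    # Placeholder values stand in for tensors; the key names are illustrative.
    raw = {"_orig_mod.module.stem.weight": "w", "module.readout.bias": "b"}
    print(list(strip_wrapper_prefixes(raw)))
```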

NTU-Transformer — BB code

import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/bb72_transformer.pt",
)

ckpt = torch.load(ckpt_path, map_location="cpu")
state_dict = {k.replace("_orig_mod.", "").replace("module.", ""): v
              for k, v in ckpt["model_state"].items()}
# Filter to keys present in the target model (skip logical_readout_bias).
model_sd = model.state_dict()
filtered = {k: v for k, v in state_dict.items()
            if k in model_sd and model_sd[k].shape == v.shape
            and k != "logical_readout_bias"}
model.load_state_dict(filtered, strict=False)
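Because the filtered load uses `strict=False`, dropped keys are silent. A possible variant of the filter that also reports why each key was skipped is sketched below. The name `filter_compatible` is hypothetical, and the demo uses stand-in objects with a `.shape` attribute in place of real tensors.

```python
from types import SimpleNamespace


def filter_compatible(source_sd, target_sd, exclude=("logical_readout_bias",)):
    """Split source_sd into loadable entries and a report of skipped keys."""
    kept, skipped = {}, {}
    for key, value in source_sd.items():
        if key in exclude:
            skipped[key] = "excluded"
        elif key not in target_sd:
            skipped[key] = "not in target model"
        elif tuple(target_sd[key].shape) != tuple(value.shape):
            skipped[key] = "shape mismatch"
        else:
            kept[key] = value
    return kept, skipped


if __name__ == "__main__":
    # Stand-ins for tensors: only the .shape attribute is inspected.
    t = lambda *shape: SimpleNamespace(shape=shape)
    source = {"a.weight": t(4, 4), "b.weight": t(4, 2),
              "logical_readout_bias": t(12), "old.weight": t(3)}
    target = {"a.weight": t(4, 4), "b.weight": t(8, 2),
              "logical_readout_bias": t(12)}
    kept, skipped = filter_compatible(source, target)
    print(sorted(kept), skipped)
```

With real checkpoints, `kept` would be passed to `model.load_state_dict(kept, strict=False)` and `skipped` logged for inspection.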

NTU-Neural-BP — BB code

import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/neural_bp_bb72.pt",
)

ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
state_dict = {k.replace("module.", ""): v for k, v in ckpt.items()}
model.load_state_dict(state_dict, strict=True)

Inference with the official code

git clone https://github.com/GrahamYan/ntu-decoder.git
cd ntu-decoder

# Surface code.
bash inference.sh --code surface --d 7 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000

# BB code — NTU-Transformer.
bash inference.sh --code bb --model transformer --block_size 72 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005

# BB code — NTU-Neural-BP.
bash inference.sh --code bb --model neural_bp --block_size 72 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
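When comparing runs at a fixed `--shots` value, it can help to attach a statistical uncertainty to the measured failure rate. The sketch below computes the binomial standard error from a failure count. The function name is hypothetical, the failure count is a made-up illustration and not a reported result, and the output format of `inference.sh` is not assumed.

```python
import math


def logical_error_rate(failures, shots):
    """Return (rate, standard_error) for `failures` decoding errors in `shots` trials."""
    if shots <= 0 or not 0 <= failures <= shots:
        raise ValueError("need shots > 0 and 0 <= failures <= shots")
    rate = failures / shots
    # Binomial standard error of the estimated rate.
    stderr = math.sqrt(rate * (1.0 - rate) / shots)
    return rate, stderr


if __name__ == "__main__":
    # Hypothetical count, for illustration only.
    rate, stderr = logical_error_rate(failures=250, shots=100_000)
    print(f"logical error rate = {rate:.4%} +/- {stderr:.4%}")
    print(f"block accuracy     = {1.0 - rate:.4%}")
```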

Model Architecture

NTU-Transformer

The Transformer-based decoder combines standard multi-head self-attention blocks with the following components, including QEC-specific input and positional encodings:

Scalable STEM embedding — Encodes syndrome data from variable-size lattices into a shared latent representation, absorbing the distance dependence into the input encoding (Eq. 2 in the paper).
QEC-aware 2D RoPE — Applies rotary position embeddings to relative algebraic displacements defined by the code's shift set M(x, y, t; d), preserving detector geometry across code distances.
Interleaved RNN-Transformer backbone — 5 GRU-based recurrent blocks alternating with 6 spatial self-attention blocks.
Cross-attention logical readout — Learnable logical query tokens attend over the encoded detector representations.
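To illustrate why rotary embeddings suit transfer across code distances, the sketch below implements a generic axial 2D RoPE and checks that the query-key score depends only on the relative displacement between two positions. This is the standard construction, not the paper's shift-set formulation M(x, y, t; d); the function names, the `base` value, and the vectors are assumptions made for illustration.

```python
import math


def rotate_pair(x0, x1, angle):
    """Rotate the 2D point (x0, x1) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (x0 * c - x1 * s, x0 * s + x1 * c)


def rope_2d(vec, row, col, base=10000.0):
    """Axial 2D RoPE: rotate the first half of vec by `row`, the second by `col`."""
    n = len(vec)
    assert n % 4 == 0, "feature dimension must be a multiple of 4"
    half = n // 2
    n_pairs = half // 2
    out = []
    for position, start in ((row, 0), (col, half)):
        for i in range(n_pairs):
            freq = base ** (-i / n_pairs)  # one frequency per feature pair
            a, b = vec[start + 2 * i], vec[start + 2 * i + 1]
            out.extend(rotate_pair(a, b, position * freq))
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    q = [0.3, -1.2, 0.7, 0.5, 1.1, -0.4, 0.2, 0.9]
    k = [-0.6, 0.8, 0.1, -0.3, 0.4, 1.5, -0.7, 0.2]
    # Both position pairs have the same displacement (2, 4).
    s1 = dot(rope_2d(q, 2, 5), rope_2d(k, 0, 1))
    s2 = dot(rope_2d(q, 9, 11), rope_2d(k, 7, 7))
    print(abs(s1 - s2) < 1e-9)
```

Since the attention score is a function of displacement alone, the same learned weights see identical local geometry on a small and a large lattice.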

NTU-Neural-BP

A graph-neural-network decoder operating on the bipartite Tanner graph:

Message passing between variable and check nodes with gated recurrent units (GRU) for message updates.
Syndrome-aware encoding of check node states and prior LLRs.
Focal loss with syndrome consistency regularization.
Compact model (~300K parameters for the [[72, 12, 6]] code).
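Two ingredients from the list above can be written down compactly: the binary focal loss and the syndrome check that a consistency term would build on. The sketch below uses the standard focal-loss definition and a hard consistency test over GF(2). The exact regularizer used in training is not specified here, so this is only an illustration; the toy parity-check matrix is a 3-bit repetition code, not a BB code, and all names are hypothetical.

```python
import math


def binary_focal_loss(p, target, gamma=2.0, eps=1e-12):
    """Standard focal loss for one bit: -(1 - p_t)^gamma * log(p_t)."""
    p_t = p if target == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(max(p_t, eps))


def syndrome(H, e):
    """Syndrome s = H e (mod 2) for parity-check rows H and error vector e."""
    return [sum(h * x for h, x in zip(row, e)) % 2 for row in H]


def is_syndrome_consistent(H, e_hat, s):
    """True if the predicted error e_hat reproduces the measured syndrome s."""
    return syndrome(H, e_hat) == s


if __name__ == "__main__":
    H = [[1, 1, 0], [0, 1, 1]]          # toy repetition-code checks
    s = syndrome(H, [0, 1, 0])          # flip on the middle bit
    print(s, is_syndrome_consistent(H, [0, 1, 0], s))
    # Confident correct predictions are down-weighted relative to cross-entropy.
    print(binary_focal_loss(0.9, 1), binary_focal_loss(0.9, 1, gamma=0.0))
```

Setting `gamma=0` recovers ordinary binary cross-entropy, which makes the down-weighting of easy examples straightforward to verify.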

Authors

Ge Yan¹, Shanchuan Li¹,², Shiyi Xiao¹,³, Pengyue Ma¹, Hanyan Cao⁴, Feng Pan⁴,*, Yuxuan Du¹,*

¹ College of Computing and Data Science, Nanyang Technological University, Singapore
² Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Japan
³ School of Artificial Intelligence, Shanghai Jiao Tong University, China
⁴ Science, Mathematics and Technology Cluster, Singapore University of Technology and Design, Singapore

* Corresponding authors

Citation

@article{ntu2026,
  title={Efficient Foundation Decoders for Fault-Tolerant Quantum Computing},
  author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
          Cao, Hanyan and Pan, Feng and Du, Yuxuan},
  year={2026},
}

License

This repository is released under the MIT License.
