NTU Neural Decoder Checkpoints

Pre-trained model weights for the neural decoders introduced in Neural Transfer Unification (NTU), an architecture-agnostic transfer-learning framework for scalable quantum error correction.

📄 Paper: Efficient Foundation Decoders for Fault-Tolerant Quantum Computing

🌐 Project page: https://grahamyan.github.io/ntu-decoder/

💻 Source code: https://github.com/GrahamYan/ntu-decoder


Overview

NTU exploits the algebraic scale invariance of structured QEC code families to transfer error knowledge from small codes to large-scale fault-tolerant regimes, eliminating the cold-start optimization barrier. The framework is instantiated with two backbone architectures:

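| Backbone        | Description                                                           | Code families |
|-----------------|-----------------------------------------------------------------------|---------------|
| NTU-Transformer | Interleaved RNN-Transformer with 2D RoPE and cross-attention readout | Surface, BB   |
| NTU-Neural-BP   | Graph-neural belief propagation on the code Tanner graph             | BB            |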

For planar surface codes under circuit-level depolarizing noise, NTU-Transformer surpasses standard PyMatching at d = 25 within a ~10³ GPU-hour training budget. For the [[72, 12, 6]] bivariate-bicycle (BB) code, it outperforms BP+OSD across all tested physical error rates and is competitive with multi-stage Relay BP. Transfer from [[72, 12, 6]] to [[144, 12, 12]] reaches 93.1% block accuracy within 2,500 steps (NTU-Transformer) and 95.3% within 500 steps (NTU-Neural-BP).
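The figures above are reported as block accuracy. As a minimal sketch, assuming block accuracy means the fraction of shots in which every logical bit of the block is predicted correctly (the usual reading, but not defined in this card), it could be computed as follows. The function name `block_accuracy` is hypothetical and not part of the official code.

```python
from typing import Sequence


def block_accuracy(predicted: Sequence[Sequence[int]],
                   actual: Sequence[Sequence[int]]) -> float:
    """Fraction of shots whose whole logical-bit vector is predicted correctly.

    Assumption: a shot counts as correct only if all of its logical bits match.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must contain the same number of shots")
    if not predicted:
        raise ValueError("at least one shot is required")
    # A single wrong logical bit makes the whole shot count as a failure.
    correct = sum(1 for p, a in zip(predicted, actual) if list(p) == list(a))
    return correct / len(predicted)
```

Under this definition a code with 12 logical qubits is scored on all 12 at once, so block accuracy is never higher than the accuracy of any single logical bit.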


Repository Structure

ntu-surface-code-decoder/
├── README.md
├── surface/                         ← Surface code (NTU-Transformer)
│   ├── d7.pth                       (121 MB, trained from scratch)
│   ├── d11.pth                      (121 MB, transferred from d=7)
│   ├── d15.pth                      (121 MB, transferred from d=11)
│   ├── d19.pth                      (121 MB, transferred from d=15)
│   ├── d23.pth                      (121 MB, transferred from d=19)
│   └── d25.pth                      (122 MB, transferred from d=23)
└── bb/                              ← BB code checkpoints
    ├── bb72_transformer.pt          (138 MB, NTU-Transformer, [[72,12,6]])
    └── neural_bp_bb72.pt            (1.2 MB, NTU-Neural-BP,  [[72,12,6]])

Each surface code checkpoint contains model_state (OrderedDict of weights), d (code distance), rounds (syndrome extraction rounds), and step (training step). BB Transformer checkpoints additionally include block_acc and output_convention metadata. NTU-Neural-BP checkpoints store the raw state_dict directly.
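Because the two layouts differ (weights wrapped under model_state versus a raw state_dict), a small helper can separate weights from metadata before loading. This is a sketch only: `describe_checkpoint` is a hypothetical name, and it operates on whatever object `torch.load` returned, assuming the layouts described above.

```python
from typing import Any, Mapping


def describe_checkpoint(ckpt: Mapping[str, Any]) -> dict:
    """Split a loaded checkpoint into its weight dictionary and its metadata."""
    if "model_state" in ckpt:
        # Wrapped layout: weights live under "model_state"; other keys are metadata.
        weights = ckpt["model_state"]
        metadata = {k: v for k, v in ckpt.items() if k != "model_state"}
    else:
        # Raw layout: the loaded object is itself the state_dict.
        weights = ckpt
        metadata = {}
    return {"weights": weights, "metadata": metadata, "num_entries": len(weights)}
```

For a surface code checkpoint the metadata would be expected to hold d, rounds, and step; for the NTU-Neural-BP file it would be empty.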


Usage

NTU-Transformer: Surface code

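import torch
from huggingface_hub import hf_hub_download

# Download the d=7 surface code checkpoint from the Hugging Face Hub.
ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="surface/d7.pth",
)

# weights_only=False unpickles arbitrary objects; only use it on files you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# `model` is an NTU-Transformer instance built with the official source code.
# Key prefixes left by torch.compile ("_orig_mod.") and by
# DataParallel/DistributedDataParallel ("module.") are removed before loading.
model.load_state_dict(
    {k.replace("_orig_mod.", "").replace("module.", ""): v
     for k, v in ckpt["model_state"].items()},
    strict=False,
)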
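The key cleanup in the snippet above uses `str.replace`, which removes the substrings wherever they occur in a key. As a hedged alternative, the sketch below strips them only when they appear as leading prefixes. `strip_wrapper_prefixes` is a hypothetical helper, not part of the official repository.

```python
from typing import Dict, Tuple, TypeVar

T = TypeVar("T")


def strip_wrapper_prefixes(
    state: Dict[str, T],
    prefixes: Tuple[str, ...] = ("_orig_mod.", "module."),
) -> Dict[str, T]:
    """Remove leading wrapper prefixes from state_dict keys, in any nesting order."""
    cleaned: Dict[str, T] = {}
    for key, value in state.items():
        stripped = True
        while stripped:  # repeat so that "_orig_mod.module.x" becomes "x"
            stripped = False
            for prefix in prefixes:
                if key.startswith(prefix):
                    key = key[len(prefix):]
                    stripped = True
        cleaned[key] = value
    return cleaned
```

The difference matters only if a parameter name contains one of the strings in the middle: `"submodule.x".replace("module.", "")` yields `"subx"`, whereas the prefix-only version leaves `"submodule.x"` unchanged.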

NTU-Transformer: BB code

import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/bb72_transformer.pt",
)

# No weights_only argument is passed, so the PyTorch default applies
# (True from PyTorch 2.6 onward, False in earlier releases).
ckpt = torch.load(ckpt_path, map_location="cpu")
state_dict = {k.replace("_orig_mod.", "").replace("module.", ""): v
              for k, v in ckpt["model_state"].items()}
# `model` is an NTU-Transformer instance built with the official source code.
# Filter to keys present in the target model (skip logical_readout_bias).
model_sd = model.state_dict()
filtered = {k: v for k, v in state_dict.items()
            if k in model_sd and model_sd[k].shape == v.shape
            and k != "logical_readout_bias"}
model.load_state_dict(filtered, strict=False)
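The filter above drops incompatible entries silently. A minimal sketch of the same filtering that also reports what was dropped and why is shown here. `filter_compatible` is a hypothetical helper; it only assumes that each value exposes a `.shape` attribute, as tensors do.

```python
from typing import Any, Dict, Tuple


def filter_compatible(
    source: Dict[str, Any],
    target: Dict[str, Any],
    skip: Tuple[str, ...] = ("logical_readout_bias",),
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Keep source entries whose key and shape match the target; explain the rest."""
    kept: Dict[str, Any] = {}
    dropped: Dict[str, str] = {}
    for key, value in source.items():
        if key in skip:
            dropped[key] = "explicitly skipped"
        elif key not in target:
            dropped[key] = "not in target"
        elif tuple(target[key].shape) != tuple(value.shape):
            dropped[key] = "shape mismatch"
        else:
            kept[key] = value
    return kept, dropped
```

Calling `kept, dropped = filter_compatible(state_dict, model.state_dict())` and printing `dropped` before loading makes it easier to spot an architecture mismatch. The object returned by `load_state_dict(..., strict=False)` also lists `missing_keys` and `unexpected_keys`.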

NTU-Neural-BP: BB code

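import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
    filename="bb/neural_bp_bb72.pt",
)

# This file stores a raw state_dict, so the restricted weights_only=True path is used.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
# `model` is an NTU-Neural-BP instance built with the official source code.
state_dict = {k.replace("module.", ""): v for k, v in ckpt.items()}
model.load_state_dict(state_dict, strict=True)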

Inference with the official code

git clone https://github.com/GrahamYan/ntu-decoder.git
cd ntu-decoder

# Surface code.
bash inference.sh --code surface --d 7 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000

# BB code: NTU-Transformer.
bash inference.sh --code bb --model transformer --block_size 72 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005

# BB code: NTU-Neural-BP.
bash inference.sh --code bb --model neural_bp --block_size 72 \
    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005

Model Architecture

NTU-Transformer

The Transformer-based decoder combines standard multi-head self-attention blocks with four components, the first two of which are QEC-specific:

  • Scalable STEM embedding: Encodes syndrome data from variable-size lattices into a shared latent representation, absorbing the distance dependence into the input encoding (Eq. 2 in the paper).
  • QEC-aware 2D RoPE: Applies rotary position embeddings to relative algebraic displacements defined by the code's shift set M(x, y, t; d), preserving detector geometry across code distances.
  • Interleaved RNN-Transformer backbone: 5 GRU-based recurrent blocks alternating with 6 spatial self-attention blocks.
  • Cross-attention logical readout: Learnable logical query tokens attend over the encoded detector representations.
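To illustrate why rotary embeddings preserve geometry across code distances, here is a minimal pure-Python sketch of a generic axis-split 2D RoPE. It is not the paper's formulation: the paper applies RoPE to displacements defined by the shift set M(x, y, t; d), whereas this sketch assumes plain (x, y) coordinates, the standard frequency schedule with base 10000, and hypothetical function names.

```python
import math
from typing import List, Sequence


def rotate_pairs(vec: Sequence[float], position: float,
                 base: float = 10000.0) -> List[float]:
    """Apply 1D rotary embedding: rotate consecutive pairs by position-dependent angles."""
    if len(vec) % 2:
        raise ValueError("vector length must be even")
    n_pairs = len(vec) // 2
    out: List[float] = []
    for i in range(n_pairs):
        theta = position * base ** (-i / n_pairs)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[2 * i], vec[2 * i + 1]
        out += [a * c - b * s, a * s + b * c]
    return out


def rope_2d(vec: Sequence[float], x: float, y: float) -> List[float]:
    """Rotate the first half of vec by the x coordinate and the second half by y."""
    if len(vec) % 4:
        raise ValueError("vector length must be a multiple of 4")
    half = len(vec) // 2
    return rotate_pairs(vec[:half], x) + rotate_pairs(vec[half:], y)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product, standing in for an attention logit."""
    return sum(a * b for a, b in zip(u, v))
```

With this construction, `dot(rope_2d(q, x1, y1), rope_2d(k, x2, y2))` depends only on the displacement (x1 - x2, y1 - y2). Shifting both detectors by the same offset leaves the attention logit unchanged, and that property is what lets a pattern learned on a small lattice be reused on a larger one.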

NTU-Neural-BP

A graph-neural-network decoder operating on the bipartite Tanner graph:

  • Message passing between variable and check nodes with gated recurrent units (GRU) for message updates.
  • Syndrome-aware encoding of check node states and prior LLRs.
  • Focal loss with syndrome consistency regularization.
  • Compact model (~300K parameters for the [[72, 12, 6]] code).
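The two loss ingredients named above can be sketched in a few lines. This shows the standard binary focal loss, -(1 - p_t)^γ log(p_t), and a hard parity check of a candidate error against the measured syndrome. The exact form and weighting of the consistency regularizer used in training are not given in this card, so only the check itself is shown. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def binary_focal_loss(prob: float, target: int,
                      gamma: float = 2.0, eps: float = 1e-12) -> float:
    """Standard focal loss for one bit; gamma=0 reduces to binary cross-entropy."""
    p_t = prob if target == 1 else 1.0 - prob  # probability assigned to the true label
    p_t = min(max(p_t, eps), 1.0)              # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def syndrome_of(check_matrix: Sequence[Sequence[int]],
                error: Sequence[int]) -> List[int]:
    """Compute H·e mod 2 for a binary parity-check matrix H and error vector e."""
    return [sum(h * e for h, e in zip(row, error)) % 2 for row in check_matrix]


def is_syndrome_consistent(check_matrix: Sequence[Sequence[int]],
                           error: Sequence[int],
                           syndrome: Sequence[int]) -> bool:
    """True if the candidate error reproduces the measured syndrome."""
    return syndrome_of(check_matrix, error) == list(syndrome)
```

The focal term down-weights bits the decoder already predicts confidently, which is useful when most error bits are zero. The parity check indicates whether a decoded error pattern is at least compatible with the observed detector outcomes.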

Authors

Ge Yan1, Shanchuan Li1,2, Shiyi Xiao1,3, Pengyue Ma1, Hanyan Cao4, Feng Pan4,*, Yuxuan Du1,*

1 College of Computing and Data Science, Nanyang Technological University, Singapore
2 Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Japan
3 School of Artificial Intelligence, Shanghai Jiao Tong University, China
4 Science, Mathematics and Technology Cluster, Singapore University of Technology and Design, Singapore

* Corresponding authors


Citation

@article{ntu2026,
  title={Efficient Foundation Decoders for Fault-Tolerant Quantum Computing},
  author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
          Cao, Hanyan and Pan, Feng and Du, Yuxuan},
  year={2026},
}

License

This repository is released under the MIT License.
