	---
	language: en
	license: mit
	tags:
	- qec
	- surface-code
	- quantum
	- pytorch
	- quantum-error-correction
	- neural-decoder
	- bivariate-bicycle
	- ldpc
	- transfer-learning
	- foundation-decoder
	pipeline_tag: other
	---

	# NTU Neural Decoder Checkpoints

	Pre-trained model weights for the neural decoders introduced in **Neural
	Transfer Unification (NTU)**, an architecture-agnostic transfer-learning
	framework for scalable quantum error correction.

	📄 Paper: Efficient Foundation Decoders for Fault-Tolerant Quantum Computing

	🌐 Project page: [https://grahamyan.github.io/ntu-decoder/](https://grahamyan.github.io/ntu-decoder/)

	💻 Source code: [https://github.com/GrahamYan/ntu-decoder](https://github.com/GrahamYan/ntu-decoder)

	---

	## Overview

	NTU exploits the algebraic scale invariance of structured QEC code families to
	transfer error knowledge from small codes to large-scale fault-tolerant regimes,
	eliminating the cold-start optimization barrier. The framework is instantiated
	with two backbone architectures:

	| Backbone | Description | Code families |
	|---|---|---|
	| NTU-Transformer | Interleaved RNN-Transformer with 2D RoPE and cross-attention readout | Surface, BB |
	| NTU-Neural-BP | Graph-neural belief propagation on the code Tanner graph | BB |

	For planar surface codes under circuit-level depolarizing noise, NTU-Transformer
	surpasses standard PyMatching at d = 25 within a ~10³ GPU-hour training budget.
	For the [[72, 12, 6]] bivariate-bicycle (BB) code, it outperforms BP+OSD across
	all tested physical error rates and is competitive with multi-stage Relay BP.
	Transfer from [[72, 12, 6]] to [[144, 12, 12]] reaches 93.1% block accuracy
	within 2,500 steps (NTU-Transformer) and 95.3% within 500 steps (NTU-Neural-BP).

	---

	## Repository Structure

	```
	ntu-surface-code-decoder/
	├── README.md
	├── surface/                  ← Surface code (NTU-Transformer)
	│   ├── d7.pth                (121 MB, trained from scratch)
	│   ├── d11.pth               (121 MB, transferred from d=7)
	│   ├── d15.pth               (121 MB, transferred from d=11)
	│   ├── d19.pth               (121 MB, transferred from d=15)
	│   ├── d23.pth               (121 MB, transferred from d=19)
	│   └── d25.pth               (122 MB, transferred from d=23)
	└── bb/                       ← BB code checkpoints
	    ├── bb72_transformer.pt   (138 MB, NTU-Transformer, [[72,12,6]])
	    └── neural_bp_bb72.pt     (1.2 MB, NTU-Neural-BP, [[72,12,6]])
	```
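
The surface-code checkpoints form a transfer chain in which each distance is initialized from the previous one. The sketch below encodes that chain using only the file names and annotations in the tree above. The names `SURFACE_DISTANCES`, `surface_checkpoint`, and `transfer_lineage` are illustrative and are not part of the official code.

```python
# Distances with a published surface-code checkpoint, in transfer order
# (d=7 is trained from scratch; each later one starts from its predecessor).
SURFACE_DISTANCES = [7, 11, 15, 19, 23, 25]


def surface_checkpoint(d: int) -> str:
    """Return the in-repository path of the checkpoint for distance d."""
    if d not in SURFACE_DISTANCES:
        raise ValueError(f"no checkpoint for d={d}; choose from {SURFACE_DISTANCES}")
    return f"surface/d{d}.pth"


def transfer_lineage(d: int) -> list[str]:
    """List the checkpoints from the from-scratch d=7 model up to distance d."""
    path = surface_checkpoint(d)  # validates d
    idx = SURFACE_DISTANCES.index(d)
    lineage = [surface_checkpoint(x) for x in SURFACE_DISTANCES[:idx]]
    return lineage + [path]


print(transfer_lineage(15))
# ['surface/d7.pth', 'surface/d11.pth', 'surface/d15.pth']
```

The returned paths can be passed as the `filename` argument of `hf_hub_download`.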

	Each surface code checkpoint contains `model_state` (OrderedDict of weights),
	`d` (code distance), `rounds` (syndrome extraction rounds), and `step`
	(training step). BB Transformer checkpoints additionally include `block_acc`
	and `output_convention` metadata. NTU-Neural-BP checkpoints store the raw
	`state_dict` directly.
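
Because the three checkpoint types use different layouts, it can help to detect the layout before loading. The sketch below relies only on the key names described above; real files may carry additional keys. The helper names `checkpoint_kind` and `extract_weights` are hypothetical.

```python
def checkpoint_kind(ckpt: dict) -> str:
    """Guess the checkpoint layout from its top-level keys."""
    if "model_state" not in ckpt:
        # NTU-Neural-BP files are described as a bare state_dict.
        return "neural_bp_state_dict"
    if "block_acc" in ckpt or "output_convention" in ckpt:
        # Extra metadata that only the BB Transformer checkpoints carry.
        return "bb_transformer"
    return "surface_transformer"


def extract_weights(ckpt: dict) -> dict:
    """Return the parameter mapping regardless of the layout."""
    return ckpt["model_state"] if "model_state" in ckpt else ckpt


# Toy dictionaries standing in for the result of torch.load(...).
surface_like = {"model_state": {"w": 1}, "d": 7, "rounds": 7, "step": 1000}
bp_like = {"gru.weight": 2}

print(checkpoint_kind(surface_like), checkpoint_kind(bp_like))
```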

	---

	## Usage

	### NTU-Transformer — Surface code

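	```python
	import torch
	from huggingface_hub import hf_hub_download

	# Download the d = 7 surface-code checkpoint from the Hub.
	ckpt_path = hf_hub_download(
	    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
	    filename="surface/d7.pth",
	)

	# weights_only=False unpickles arbitrary Python objects, so use it only
	# on checkpoints from a source you trust.
	ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

	# `model` must already be an NTU-Transformer instance built with the
	# official source code; it is not defined in this snippet.
	# Strip the prefixes added by torch.compile ("_orig_mod.") and by
	# DataParallel/DDP wrappers ("module.") before loading.
	model.load_state_dict(
	    {k.replace("_orig_mod.", "").replace("module.", ""): v
	     for k, v in ckpt["model_state"].items()},
	    strict=False,
	)
	```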
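
The two prefixes removed in the snippet above come from wrappers: `torch.compile` exposes the original module under `_orig_mod`, and `DataParallel`/`DistributedDataParallel` expose it under `module`. The chained `replace` calls remove these substrings anywhere in a key. A stricter variant that strips only leading prefixes could look like the following sketch; `WRAPPER_PREFIXES`, `strip_wrapper_prefixes`, and `normalize_state_dict` are illustrative names, not part of the official code.

```python
WRAPPER_PREFIXES = ("_orig_mod.", "module.")


def strip_wrapper_prefixes(key: str) -> str:
    """Remove leading wrapper prefixes, in any order, from a parameter name."""
    stripped = True
    while stripped:
        stripped = False
        for prefix in WRAPPER_PREFIXES:
            if key.startswith(prefix):
                key = key[len(prefix):]
                stripped = True
    return key


def normalize_state_dict(state: dict) -> dict:
    """Return a copy of `state` with wrapper prefixes removed from every key."""
    return {strip_wrapper_prefixes(k): v for k, v in state.items()}


print(strip_wrapper_prefixes("module._orig_mod.encoder.weight"))  # encoder.weight
```

If individual submodules were wrapped, the prefix can appear in the middle of a key, and the substring-based `replace` version is then the appropriate one.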

	### NTU-Transformer — BB code

	```python
	import torch
	from huggingface_hub import hf_hub_download

	ckpt_path = hf_hub_download(
	    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
	    filename="bb/bb72_transformer.pt",
	)

	ckpt = torch.load(ckpt_path, map_location="cpu")
	# Strip torch.compile / DataParallel prefixes from the parameter names.
	state_dict = {k.replace("_orig_mod.", "").replace("module.", ""): v
	              for k, v in ckpt["model_state"].items()}
	# `model` must already be an NTU-Transformer instance for the BB code.
	# Keep only keys that exist in the target model with a matching shape,
	# and skip logical_readout_bias.
	model_sd = model.state_dict()
	filtered = {k: v for k, v in state_dict.items()
	            if k in model_sd and model_sd[k].shape == v.shape
	            and k != "logical_readout_bias"}
	model.load_state_dict(filtered, strict=False)
	```
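
With `strict=False` and a shape filter, dropped parameters are silent, so it is worth checking what was actually loaded. The sketch below classifies keys from their shapes alone. `plan_partial_load` is a hypothetical helper, and with real tensors the shape dictionaries would be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def plan_partial_load(ckpt_shapes: dict, model_shapes: dict,
                      skip: tuple = ("logical_readout_bias",)) -> dict:
    """Classify checkpoint keys by whether they can be loaded into the model."""
    report = {"load": [], "skipped": [], "unexpected": [],
              "shape_mismatch": [], "missing": []}
    for key, shape in ckpt_shapes.items():
        if key in skip:
            report["skipped"].append(key)          # excluded on purpose
        elif key not in model_shapes:
            report["unexpected"].append(key)       # not in the target model
        elif tuple(model_shapes[key]) != tuple(shape):
            report["shape_mismatch"].append(key)   # same name, other shape
        else:
            report["load"].append(key)
    # Model parameters that would keep their initial values.
    loadable = set(report["load"])
    report["missing"] = [k for k in model_shapes if k not in loadable]
    return report


# Toy shapes for illustration only.
report = plan_partial_load(
    {"a.weight": (4, 4), "b.weight": (2,), "logical_readout_bias": (12,)},
    {"a.weight": (4, 4), "b.weight": (3,)},
)
print(report["load"], report["missing"])
```

`load_state_dict` itself returns an object with `missing_keys` and `unexpected_keys`, which gives the same information for the keys that reach it.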

	### NTU-Neural-BP — BB code

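	```python
	import torch
	from huggingface_hub import hf_hub_download

	ckpt_path = hf_hub_download(
	    repo_id="Dreamworldsmile/ntu-surface-code-decoder",
	    filename="bb/neural_bp_bb72.pt",
	)

	# The file is a bare state_dict, so the safer weights_only=True is used.
	ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
	# `model` must already be an NTU-Neural-BP instance for the [[72, 12, 6]] code.
	state_dict = {k.replace("module.", ""): v for k, v in ckpt.items()}
	model.load_state_dict(state_dict, strict=True)
	```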

	### Inference with the official code

	```bash
	git clone https://github.com/GrahamYan/ntu-decoder.git
	cd ntu-decoder

	# Surface code.
	bash inference.sh --code surface --d 7 \
	    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000

	# BB code: NTU-Transformer.
	bash inference.sh --code bb --model transformer --block_size 72 \
	    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005

	# BB code: NTU-Neural-BP.
	bash inference.sh --code bb --model neural_bp --block_size 72 \
	    --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
	```

	---

	## Model Architecture

	### NTU-Transformer

	The Transformer-based decoder combines standard multi-head self-attention
	blocks with four QEC-specific components:

	- Scalable STEM embedding — Encodes syndrome data from variable-size
	lattices into a shared latent representation, absorbing the distance
	dependence into the input encoding (Eq. 2 in the paper).
	- QEC-aware 2D RoPE — Applies rotary position embeddings to relative
	algebraic displacements defined by the code's shift set M(x, y, t; d),
	preserving detector geometry across code distances.
	- Interleaved RNN-Transformer backbone — 5 GRU-based recurrent blocks
	alternating with 6 spatial self-attention blocks.
	- Cross-attention logical readout — Learnable logical query tokens attend
	over the encoded detector representations.
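
The reason rotary embeddings suit transfer across code distances is that the attention score depends only on the relative displacement between two positions. The sketch below shows this with a generic 2D RoPE in plain Python: each complex number packs one pair of real channels, the first half of the pairs is rotated by the x coordinate and the second half by y. It does not implement the paper's shift set M(x, y, t; d) or the time axis, and the `base` frequency schedule is the conventional choice rather than a value taken from the source.

```python
import cmath


def rope_2d(vec: list[complex], x: float, y: float,
            base: float = 10000.0) -> list[complex]:
    """Rotate complex channel pairs: first half by x, second half by y."""
    half = len(vec) // 2  # assumes an even, non-zero number of pairs
    out = []
    for i, z in enumerate(vec):
        pos = x if i < half else y
        j = i if i < half else i - half
        freq = base ** (-j / half)  # geometric frequency schedule
        out.append(z * cmath.exp(1j * freq * pos))
    return out


def attention_score(q: list[complex], k: list[complex]) -> float:
    """Real inner product of two complex-packed vectors."""
    return sum((a * b.conjugate()).real for a, b in zip(q, k))


q = [1 + 2j, 0.5 - 1j, -0.3 + 0.7j, 2 - 0.5j]
k = [0.2 + 0.1j, -1 + 1j, 0.4 - 0.9j, 1.5 + 0.3j]

# Same displacement (-2, -3) at two different absolute positions.
s_near = attention_score(rope_2d(q, 1, 2), rope_2d(k, 3, 5))
s_far = attention_score(rope_2d(q, 11, 22), rope_2d(k, 13, 25))
print(abs(s_near - s_far) < 1e-9)  # True: the score ignores absolute position
```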

	### NTU-Neural-BP

	A graph-neural-network decoder operating on the bipartite Tanner graph:

	- Message passing between variable and check nodes with gated recurrent
	units (GRU) for message updates.
	- Syndrome-aware encoding of check node states and prior LLRs.
	- Focal loss with syndrome consistency regularization.
	- Compact model (~300K parameters for the [[72, 12, 6]] code).
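
To make the two loss ingredients concrete, the sketch below shows the standard binary focal loss and a hard syndrome-consistency check on a toy parity-check matrix. This README does not state the exact form or weighting of the regularizer, so the mismatch count is only an illustration of what "syndrome consistency" measures; `gamma = 2` is the common default rather than a value from the source, and the function names are hypothetical.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0, eps: float = 1e-12) -> float:
    """Binary focal loss for predicted probability p of class 1 and label y."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -((1.0 - p_t) ** gamma) * math.log(max(p_t, eps))


def syndrome_mismatch(H: list[list[int]], e_hat: list[int], s: list[int]) -> int:
    """Count checks where H @ e_hat (mod 2) disagrees with the syndrome s."""
    return sum(
        (sum(h * e for h, e in zip(row, e_hat)) % 2) != s_i
        for row, s_i in zip(H, s)
    )


# Toy example: 3-bit repetition code, true error on the middle bit.
H = [[1, 1, 0],
     [0, 1, 1]]
s = [1, 1]

print(syndrome_mismatch(H, [0, 1, 0], s))  # 0: prediction reproduces s
print(syndrome_mismatch(H, [1, 0, 0], s))  # 1: second check is violated
```

Focal loss down-weights confidently correct predictions, so with `gamma = 0` it reduces to ordinary cross-entropy.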

	---

	## Authors

	[Ge Yan](https://grahamyan.github.io)<sup>1</sup>,
	Shanchuan Li<sup>1, 2</sup>,
	Shiyi Xiao<sup>1, 3</sup>,
	Pengyue Ma<sup>1</sup>,
	Hanyan Cao<sup>4</sup>,
	[Feng Pan](https://scholar.google.com/citations?user=Vp6hFhUAAAAJ)<sup>4,\*</sup>,
	[Yuxuan Du](https://yuxuan-du.github.io)<sup>1,\*</sup>

	<sup>1</sup> College of Computing and Data Science, Nanyang Technological University, Singapore<br>
	<sup>2</sup> Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Japan<br>
	<sup>3</sup> School of Artificial Intelligence, Shanghai Jiao Tong University, China<br>
	<sup>4</sup> Science, Mathematics and Technology Cluster, Singapore University of Technology and Design, Singapore

	<small><sup>\*</sup> Corresponding authors</small>

	---

	## Citation

	```bibtex
	@article{ntu2026,
	  title={Efficient Foundation Decoders for Fault-Tolerant Quantum Computing},
	  author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
	          Cao, Hanyan and Pan, Feng and Du, Yuxuan},
	  year={2026},
	}
	```

	---

	## License

	This repository is released under the [MIT License](https://opensource.org/licenses/MIT).