Dreamworldsmile committed on
Commit 464eec5 · verified · 1 Parent(s): ad3f647

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +170 -39
README.md CHANGED
@@ -2,66 +2,104 @@
  language: en
  license: mit
  tags:
- - qec
- - surface-code
- - quantum
- - pytorch
- - quantum-error-correction
- - neural-decoder
  pipeline_tag: other
  ---

- # NTU Surface Code Decoder (AlphaQubit V2)

- Pre-trained neural decoder checkpoints for rotated surface codes, based on the
- **Neural Transfer Unification (NTU)** framework.

  📄 **Paper**: *Transfer Learning is All You Need for Scalable Neural Decoder*

- ## Model Architecture

- **AlphaQubit V2**: a high-capacity neural decoder (~58M parameters) featuring:

- - **Interleaved RNN-Transformer backbone** (5 GRU + 6 self-attention layers)
- - **2D Rotary Position Embedding (RoPE)** based on physical detector coordinates
- - **Joint X+Z stabilizer processing** with spatial hint connections
- - **Cross-attention readout** with learnable logical query tokens
- - Trained with **progressive knowledge distillation** from MWPM pseudo-labels
 
  ## Repository Structure

  ```
  ntu-surface-code-decoder/
  ├── README.md
- ├── surface/            ← Surface code checkpoints (AlphaQubit V2)
- │   ├── d7.pth          (~121 MB, scratch)
- │   ├── d11.pth         (~121 MB, transfer from d7)
- │   ├── d15.pth         (~121 MB, transfer from d11)
- │   ├── d19.pth         (~121 MB, transfer from d15)
- │   ├── d23.pth         (~121 MB, transfer from d19)
- │   └── d25.pth         (~122 MB, transfer from d23)
- └── bb/                 ← BB code checkpoints (coming soon)
  ```

- Each checkpoint contains:
- - `model_state`: OrderedDict of model weights
- - `d`: code distance (int)
- - `rounds`: decoding rounds (int)
- - `step`: training step (int)
 
  ## Usage

  ```python
  import torch
  from huggingface_hub import hf_hub_download

- # Download a surface code checkpoint
  ckpt_path = hf_hub_download(
      repo_id="Dreamworldsmile/ntu-surface-code-decoder",
      filename="surface/d7.pth",
  )

- # Load into AlphaQubit V2
  ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
  model.load_state_dict(
      {k.replace("_orig_mod.", "").replace("module.", ""): v
@@ -70,30 +108,123 @@ model.load_state_dict(
  )
  ```

- ### With the official code

  ```bash
- # Inference: auto-downloads surface/d{d}.pth
- python inference.py --hf_repo Dreamworldsmile/ntu-surface-code-decoder --d 7 --shots 100000

- # Transfer learning: specify full path within the repo
- bash train.sh --mode transfer \
-     --hf_ckpt Dreamworldsmile/ntu-surface-code-decoder/surface/d7.pth --d 11 ...
  ```
 
  ## Authors

- Ge Yan, Shanchuan Li, **Shiyi Xiao**, Pengyue Ma, Hanyan Cao, Feng Pan, Yuxuan Du

- *Nanyang Technological University · TUAT · Shanghai Jiao Tong University · SUTD*

  ## Citation

  ```bibtex
  @article{ntu2026,
      title={Transfer Learning is All You Need for Scalable Neural Decoder},
      author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
              Cao, Hanyan and Pan, Feng and Du, Yuxuan},
      year={2026},
  }
  ```
 
@@ -2,66 +2,104 @@
  language: en
  license: mit
  tags:
+ - qec
+ - surface-code
+ - quantum
+ - pytorch
+ - quantum-error-correction
+ - neural-decoder
+ - bivariate-bicycle
+ - ldpc
  pipeline_tag: other
  ---

+ # NTU Neural Decoder Checkpoints

+ Pre-trained neural decoder model weights for quantum error correction (QEC)
+ codes, based on the **Neural Transfer Unification (NTU)** framework introduced
+ in the accompanying paper.

  📄 **Paper**: *Transfer Learning is All You Need for Scalable Neural Decoder*
+ 🌐 **Project page**: [https://grahamyan.github.io/ntu-decoder/](https://grahamyan.github.io/ntu-decoder/)

+ ---

+ ## Overview

+ This repository hosts the official model checkpoints for two families of QEC
+ codes:
+
+ | Code family | Architecture | Decoder |
+ |---|---|---|
+ | Rotated surface code | AlphaQubit V2 (~58M parameters) | Transformer-based |
+ | Bivariate-bicycle (BB) code | AlphaQubitV2_BB (~XXM parameters) | Transformer-based |
+ | Bivariate-bicycle (BB) code | Neural Belief Propagation | GNN-based message passing |
+
+ All models are implemented in PyTorch and trained with distributed data-parallel
+ (DDP) across 8 GPUs. The surface code decoder uses progressive knowledge
+ distillation from minimum-weight perfect matching (MWPM) pseudo-labels;
+ the BB decoder is trained end-to-end on sampled syndromes.
+
+ ---

  ## Repository Structure
 
  ```
  ntu-surface-code-decoder/
  ├── README.md
+ ├── surface/                    ← Surface code checkpoints (AlphaQubit V2)
+ │   ├── d7.pth                  (121 MB, trained from scratch)
+ │   ├── d11.pth                 (121 MB, transfer learning from d=7)
+ │   ├── d15.pth                 (121 MB, transfer learning from d=11)
+ │   ├── d19.pth                 (121 MB, transfer learning from d=15)
+ │   ├── d23.pth                 (121 MB, transfer learning from d=19)
+ │   └── d25.pth                 (122 MB, transfer learning from d=23)
+ └── bb/                         ← BB code checkpoints
+     ├── bb72_transformer.pt     (138 MB, AlphaQubitV2_BB, [[72,12,6]] code)
+     └── neural_bp_bb72.pt       (1.2 MB, Neural-BP, [[72,12,6]] code)
  ```
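As a rough illustration of the layout above, the following sketch maps a code family and its parameters to a path inside the repository and recovers the transfer-learning chain for a given surface code distance. The names `checkpoint_filename`, `transfer_chain`, and `SURFACE_TRANSFER_SOURCE` are hypothetical and not part of the official code. The BB file-name patterns are extrapolated from the two files listed in the tree and may not hold for other block sizes.

```python
# Hypothetical helpers (not part of the official code): they only encode the
# directory layout shown in the repository tree.

# Each listed distance mapped to the distance it was transferred from
# (None means trained from scratch).
SURFACE_TRANSFER_SOURCE = {7: None, 11: 7, 15: 11, 19: 15, 23: 19, 25: 23}


def checkpoint_filename(code, d=None, model=None, block_size=None):
    """Return the repo-relative filename for a checkpoint."""
    if code == "surface":
        if d not in SURFACE_TRANSFER_SOURCE:
            raise ValueError(f"no surface checkpoint listed for d={d}")
        return f"surface/d{d}.pth"
    if code == "bb":
        # Assumption: other block sizes would follow the bb72 naming pattern.
        if model == "transformer":
            return f"bb/bb{block_size}_transformer.pt"
        if model == "neural_bp":
            return f"bb/neural_bp_bb{block_size}.pt"
        raise ValueError(f"unknown BB model: {model!r}")
    raise ValueError(f"unknown code family: {code!r}")


def transfer_chain(d):
    """List the distances trained on the way to distance d, scratch model first."""
    chain = []
    while d is not None:
        chain.append(d)
        d = SURFACE_TRANSFER_SOURCE[d]
    return chain[::-1]


print(checkpoint_filename("surface", d=11))                         # surface/d11.pth
print(checkpoint_filename("bb", model="neural_bp", block_size=72))  # bb/neural_bp_bb72.pt
print(transfer_chain(19))                                           # [7, 11, 15, 19]
```

The returned string can be passed as the `filename` argument of `hf_hub_download`.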
 
+ ### Checkpoint format
+
+ **Surface code checkpoints** (`surface/*.pth`):
+ | Key | Type | Description |
+ |---|---|---|
+ | `model_state` | `OrderedDict` | Model weights (strip `_orig_mod.` and `module.` prefixes before loading) |
+ | `d` | `int` | Code distance |
+ | `rounds` | `int` | Syndrome extraction rounds |
+ | `step` | `int` | Training step at which the checkpoint was saved |
+
+ **BB Transformer checkpoints** (`bb/bb*_transformer.pt`):
+ | Key | Type | Description |
+ |---|---|---|
+ | `model_state` | `OrderedDict` | Model weights |
+ | `step` | `int` | Training step |
+ | `block_acc` | `float` | Block accuracy at save time |
+ | `per_log_mean` | `float` | Per-logical average accuracy |
+ | `output_convention` | `dict` | Logical observable convention metadata |
+
+ **Neural-BP checkpoints** (`bb/neural_bp_*.pt`):
+ | Key | Type | Description |
+ |---|---|---|
+ | (raw `state_dict`) | `OrderedDict` | Model weights (strip `module.` prefix before loading) |
+
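Because the three checkpoint types do not share one layout, a loader has to distinguish the wrapped form (weights under `model_state`) from the raw form (the file is the state dict). A minimal sketch of that distinction follows, using plain dictionaries as stand-ins for loaded checkpoints. The function names and the placeholder values are illustrative only and do not come from the official code.

```python
def extract_state_dict(ckpt):
    """Return the weights mapping from either checkpoint layout.

    Surface and BB Transformer checkpoints wrap the weights under the
    "model_state" key; Neural-BP checkpoints are the raw state dict itself.
    """
    if "model_state" in ckpt:
        return ckpt["model_state"]
    return ckpt


def checkpoint_metadata(ckpt):
    """Return the non-weight entries (such as d, rounds, step), if any."""
    if "model_state" not in ckpt:
        return {}
    return {k: v for k, v in ckpt.items() if k != "model_state"}


# Stand-ins for torch.load(...) results; the values are made-up placeholders.
surface_like = {"model_state": {"layer.weight": 0.0}, "d": 7, "rounds": 7, "step": 1000}
neural_bp_like = {"module.layer.weight": 0.0}

print(sorted(extract_state_dict(surface_like)))    # ['layer.weight']
print(checkpoint_metadata(surface_like))           # {'d': 7, 'rounds': 7, 'step': 1000}
print(sorted(extract_state_dict(neural_bp_like)))  # ['module.layer.weight']
print(checkpoint_metadata(neural_bp_like))         # {}
```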
+ ---

  ## Usage

+ ### Surface code: AlphaQubit V2
+
  ```python
  import torch
  from huggingface_hub import hf_hub_download

+ # Download a surface code checkpoint.
  ckpt_path = hf_hub_download(
      repo_id="Dreamworldsmile/ntu-surface-code-decoder",
      filename="surface/d7.pth",
  )

+ # Load into an AlphaQubit V2 model instance.
  ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
  model.load_state_dict(
      {k.replace("_orig_mod.", "").replace("module.", ""): v
@@ -70,30 +108,123 @@ model.load_state_dict(
  )
  ```
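The key renaming in the snippet above removes wrapper prefixes: `torch.compile` adds `_orig_mod.` and `DataParallel`/`DistributedDataParallel` add `module.` to parameter names. The sketch below isolates that step as a standalone function that runs on any mapping. The name `strip_wrapper_prefixes` is hypothetical, and plain integers stand in for tensors.

```python
def strip_wrapper_prefixes(state_dict, prefixes=("_orig_mod.", "module.")):
    """Return a copy of state_dict with wrapper prefixes removed from keys.

    Mirrors the chained str.replace calls used in the loading snippet, so each
    prefix string is removed wherever it occurs in a key.
    """
    cleaned = {}
    for key, value in state_dict.items():
        for prefix in prefixes:
            key = key.replace(prefix, "")
        cleaned[key] = value
    return cleaned


# Integers stand in for tensors; only the keys matter here.
wrapped = {
    "module._orig_mod.encoder.weight": 1,
    "_orig_mod.module.encoder.bias": 2,
    "head.bias": 3,
}
print(strip_wrapper_prefixes(wrapped))
# {'encoder.weight': 1, 'encoder.bias': 2, 'head.bias': 3}
```

One caveat of `str.replace` is that it matches anywhere in the key, so a parameter path that merely contains `module.` (for example `submodule.weight`) would also be altered. If a model has such names, anchoring the match to the start of the key is the safer choice.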
 
+ ### BB code: AlphaQubitV2_BB (Transformer)
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+
+ ckpt_path = hf_hub_download(
+     repo_id="Dreamworldsmile/ntu-surface-code-decoder",
+     filename="bb/bb72_transformer.pt",
+ )
+
+ ckpt = torch.load(ckpt_path, map_location="cpu")
+ state_dict = ckpt["model_state"]
+ state_dict = {k.replace("_orig_mod.", "").replace("module.", ""): v
+               for k, v in state_dict.items()}
+ # Filter to keys present in the model (skip logical_readout_bias).
+ model_sd = model.state_dict()
+ filtered = {k: v for k, v in state_dict.items()
+             if k in model_sd and model_sd[k].shape == v.shape
+             and k != "logical_readout_bias"}
+ model.load_state_dict(filtered, strict=False)
+ ```
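The filtered load above drops entries silently, so it can help to see which keys were skipped and why. The following sketch reimplements the same filter with a small report. The function name `filter_compatible` and the example keys are made up for illustration, and `SimpleNamespace` objects with a `.shape` attribute stand in for tensors so the example runs without PyTorch.

```python
from types import SimpleNamespace


def filter_compatible(ckpt_sd, model_sd, skip=("logical_readout_bias",)):
    """Split checkpoint entries into loadable ones and skipped ones.

    An entry is kept only if the model has a parameter with the same name and
    shape and the key is not listed in `skip`.
    """
    kept, skipped = {}, {}
    for key, value in ckpt_sd.items():
        if key in skip:
            skipped[key] = "explicitly skipped"
        elif key not in model_sd:
            skipped[key] = "not in model"
        elif tuple(model_sd[key].shape) != tuple(value.shape):
            skipped[key] = "shape mismatch"
        else:
            kept[key] = value
    return kept, skipped


def fake(*shape):
    """Stand-in for a tensor: only the .shape attribute is used."""
    return SimpleNamespace(shape=shape)


ckpt_sd = {
    "embed.weight": fake(8, 4),
    "head.weight": fake(12, 4),
    "logical_readout_bias": fake(12),
    "old.layer": fake(2),
}
model_sd = {
    "embed.weight": fake(8, 4),
    "head.weight": fake(6, 4),
    "logical_readout_bias": fake(12),
}

kept, skipped = filter_compatible(ckpt_sd, model_sd)
print(sorted(kept))  # ['embed.weight']
print(skipped)
```

With real tensors, the value returned by `model.load_state_dict(filtered, strict=False)` also exposes `missing_keys` and `unexpected_keys`, which lists the model parameters that received no weights.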
+
+ ### BB code: Neural Belief Propagation
+
+ ```python
+ ckpt_path = hf_hub_download(
+     repo_id="Dreamworldsmile/ntu-surface-code-decoder",
+     filename="bb/neural_bp_bb72.pt",
+ )
+
+ ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=True)
+ state_dict = {k.replace("module.", ""): v for k, v in ckpt.items()}
+ model.load_state_dict(state_dict, strict=True)
+ ```
+
+ ### Inference with the official code
+
+ The [official implementation](https://github.com/GrahamYan/ntu-decoder) provides a
+ unified inference launcher that automatically downloads the required checkpoint:

  ```bash
+ # Surface code inference.
+ bash inference.sh --code surface --d 7 \
+     --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000
+
+ # BB Transformer inference.
+ bash inference.sh --code bb --model transformer --block_size 72 \
+     --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005

+ # BB Neural-BP inference.
+ bash inference.sh --code bb --model neural_bp --block_size 72 \
+     --hf_repo Dreamworldsmile/ntu-surface-code-decoder --shots 100000 --p 0.005
  ```
 
+ For training and baseline evaluations, please refer to the shell scripts under
+ `codes/Surface/` and `codes/BB/` in the source repository.
+
+ ---
+
+ ## Model Architecture
+
+ ### AlphaQubit V2 / AlphaQubitV2_BB
+
+ A high-capacity neural decoder featuring:
+
+ - **Interleaved RNN-Transformer backbone** (5 GRU + 6 self-attention layers)
+ - **2D Rotary Position Embedding (RoPE)** based on physical detector coordinates
+ - **Joint X+Z stabilizer processing** with spatial hint connections between
+   same-type and cross-type stabilizers
+ - **Cross-attention readout** with learnable logical query tokens
+ - Trained with **progressive knowledge distillation** from MWPM pseudo-labels
+   (surface code) or end-to-end on sampled syndromes (BB code)
+
+ ### Neural Belief Propagation
+
+ A graph-neural-network decoder operating on the Tanner graph of the code:
+
+ - **Bipartite message passing** between variable and check nodes
+ - **Gated recurrent units (GRU)** for message updates
+ - **Focal loss** with syndrome consistency regularization
+ - Compact model size (~300K parameters for BB72)
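To make the Tanner-graph view concrete: each check node is connected to the variable nodes that appear in its row of the parity-check matrix, and the syndrome is the parity of the error over each check's neighbours. The sketch below builds those adjacency lists and computes a syndrome for a toy matrix. It uses a 3-bit repetition code rather than the [[72,12,6]] BB code, the function names are illustrative, and it shows only the graph that message passing runs on, not the trained decoder or its loss.

```python
# Toy parity-check matrix (3-bit repetition code): rows are checks,
# columns are variables. This is NOT the BB code used by the checkpoints.
H = [
    [1, 1, 0],
    [0, 1, 1],
]


def tanner_neighbors(H):
    """Return (check -> variables, variable -> checks) adjacency lists."""
    check_to_var = [[j for j, h in enumerate(row) if h] for row in H]
    var_to_check = [[i for i, row in enumerate(H) if row[j]]
                    for j in range(len(H[0]))]
    return check_to_var, var_to_check


def syndrome(H, error):
    """Parity of the error bits adjacent to each check node."""
    check_to_var, _ = tanner_neighbors(H)
    return [sum(error[j] for j in nbrs) % 2 for nbrs in check_to_var]


check_to_var, var_to_check = tanner_neighbors(H)
print(check_to_var)             # [[0, 1], [1, 2]]
print(var_to_check)             # [[0], [0, 1], [1]]
print(syndrome(H, [0, 1, 0]))   # [1, 1]
```

A candidate correction is consistent with a measured syndrome when `syndrome(H, correction)` reproduces it. How the regularizer named above uses this condition is not specified in this README.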
+
+ ---
+
  ## Authors

+ Ge Yan<sup>1</sup>, Shanchuan Li<sup>1,2</sup>, **Shiyi Xiao**<sup>1,3</sup>,
+ Pengyue Ma<sup>1</sup>, Hanyan Cao<sup>4</sup>, Feng Pan<sup>4,\*</sup>,
+ Yuxuan Du<sup>1,\*</sup>
+
+ <sup>1</sup> Nanyang Technological University &nbsp;
+ <sup>2</sup> Tokyo University of Agriculture and Technology &nbsp;
+ <sup>3</sup> Shanghai Jiao Tong University &nbsp;
+ <sup>4</sup> Singapore University of Technology and Design

+ <small><sup>\*</sup> Corresponding authors</small>
+
+ ---

  ## Citation

+ If you use these model weights or the NTU framework in your research, please
+ cite the accompanying paper:
+
  ```bibtex
  @article{ntu2026,
      title={Transfer Learning is All You Need for Scalable Neural Decoder},
      author={Yan, Ge and Li, Shanchuan and Xiao, Shiyi and Ma, Pengyue and
              Cao, Hanyan and Pan, Feng and Du, Yuxuan},
+     journal={arXiv preprint},
      year={2026},
  }
  ```
+
+ ---
+
+ ## License
+
+ This repository is released under the [MIT License](https://opensource.org/licenses/MIT).