RobBobin committed
Commit e57863f (verified) · 1 Parent(s): 34889ab

Upload README.md with huggingface_hub

Files changed (1):
  README.md (+32 −2)
README.md CHANGED
@@ -9,6 +9,8 @@ tags:
 - semistandard-tableaux
 - reverse-plane-partitions
 - hillman-grassl
+ - cylindric-plane-partitions
+ - growth-diagrams
 - transformer
 - pytorch
 datasets:
@@ -19,12 +21,13 @@ pipeline_tag: other
 
 # RSK Transformer
 
- A transformer that learns **inverse combinatorial bijections** — the Robinson-Schensted-Knuth correspondence (permutations and matrices) and the Hillman-Grassl correspondence (reverse plane partitions). The same architecture handles all tasks without modification.
+ A transformer that learns **inverse combinatorial bijections** — the Robinson-Schensted-Knuth correspondence (permutations and matrices), the Hillman-Grassl correspondence (reverse plane partitions), and the cylindric growth diagram bijection (cylindric plane partitions). The same architecture handles all tasks without modification.
 
- Achieves **100% exact-match accuracy** on held-out test data for permutations at n=10, **99.99%** at n=15 (1.3 trillion permutations), **100%** on 3×3 matrix RSK, and **100%** on reverse plane partitions of shape (4,3,2,1) — significantly improving on the [PNNL ML4AlgComb benchmark](https://github.com/pnnl/ML4AlgComb/tree/master/rsk). Scales to 5×5 matrices (96.8% exact match on a space of ~10¹⁴).
+ Achieves **100% exact-match accuracy** on held-out test data for permutations at n=10, **99.99%** at n=15 (1.3 trillion permutations), **100%** on 3×3 matrix RSK, **100%** on reverse plane partitions of shape (4,3,2,1), and **99.98–100%** on cylindric plane partitions — significantly improving on the [PNNL ML4AlgComb benchmark](https://github.com/pnnl/ML4AlgComb/tree/master/rsk). Scales to 5×5 matrices (96.8% exact match on a space of ~10¹⁴).
 
 📄 **Paper**: [paper.pdf](paper.pdf)
 💻 **Code**: [github.com/RaggedR/rsk-transformer](https://github.com/RaggedR/rsk-transformer)
+ 📘 **Thesis**: [Langer (2013) — Cylindric plane partitions, Lambda determinants, Commutants in semicircular systems](https://arxiv.org/abs/2110.12629) — the mathematical foundation for the cylindric growth diagram bijection (§4.2–4.3) and generalized RSK via Fomin growth diagrams (§2.1–2.2)
 
 ## Results
 
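As context for the inverse-bijection task the README above describes, here is a minimal, self-contained sketch of Schensted row insertion and its inverse for permutations. This is an illustrative aside, not the repository's code: function names and conventions are this sketch's own.

```python
# Illustrative sketch (NOT the repository's code): the Robinson-Schensted
# correspondence for permutations via row insertion, plus the inverse map,
# which is the function the transformer is trained to approximate.

def rsk(perm):
    """Map a permutation (list of distinct ints) to a pair (P, Q) of tableaux."""
    P, Q = [], []
    for i, x in enumerate(perm, start=1):
        r = 0
        while True:
            if r == len(P):                 # fell off the bottom: start a new row
                P.append([x])
                Q.append([i])
                break
            row = P[r]
            # leftmost entry strictly greater than x, if any
            j = next((k for k, y in enumerate(row) if y > x), None)
            if j is None:                   # x is largest: append at end of row
                row.append(x)
                Q[r].append(i)
                break
            x, row[j] = row[j], x           # bump it and insert into the next row
            r += 1
    return P, Q

def inverse_rsk(P, Q):
    """Recover the permutation from (P, Q) by reverse bumping, largest Q entry first."""
    P = [row[:] for row in P]               # work on copies
    Q = [row[:] for row in Q]
    n = sum(len(row) for row in Q)
    out = []
    for i in range(n, 0, -1):
        # entry i sits at the end of some row of Q (Q is standard)
        r = next(r for r, row in enumerate(Q) if row and row[-1] == i)
        Q[r].pop()
        x = P[r].pop()
        for s in range(r - 1, -1, -1):
            # reverse-bump: x displaces the rightmost entry smaller than x
            j = max(k for k, y in enumerate(P[s]) if y < x)
            x, P[s][j] = P[s][j], x
        out.append(x)
    return out[::-1]

P, Q = rsk([3, 1, 4, 5, 2])
assert inverse_rsk(P, Q) == [3, 1, 4, 5, 2]   # round-trips exactly
```

The inverse direction is exactly what the model must learn end to end: given the tokenized pair (P, Q), emit the permutation.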
@@ -75,6 +78,19 @@ Given a reverse plane partition (RPP) of shape λ, recover the arbitrary filling
 
 The Hillman-Grassl bijection is fundamentally different from RSK — it involves zigzag paths through the Young diagram rather than Schensted insertion — yet the same transformer architecture learns it to near-perfect accuracy. Tall shapes converge more slowly (36 epochs vs. 17–23) because longer zigzag paths create longer-range dependencies.
 
+ ### Experiment 4: Cylindric Plane Partitions (Growth Diagrams)
+
+ Given a cylindric plane partition (CPP) with binary profile π, recover the base partition γ and the ALCD face labels via the inverse cylindric growth diagram bijection. This uses the **Burge local rule** applied recursively through a cylindric growth diagram, as described in [Langer (2013), §4.2–4.3](https://arxiv.org/abs/2110.12629). **Same model architecture**.
+
+ | Profile π | T | ALCD labels | Training data | Test exact match | Per-position | Best epoch |
+ |-----------|---|-------------|---------------|------------------|--------------|------------|
+ | (1,0,1,0) | 4 | 3 | 500,000 | **100.00%** | **100.00%** | 2 |
+ | (1,0,1,0,0) | 5 | 5 | 500,000 | **100.00%** | **100.00%** | 7 |
+ | (1,0,1,0,1,0) | 6 | 6 | 500,000 | **100.00%** | **100.00%** | 3 |
+ | (1,0,1,0,1,0,1,0) | 8 | 10 | 500,000 | **99.98%** | **100.00%** | 9 |
+
+ The cylindric bijection is qualitatively different from all previous experiments: there is no direct closed-form algorithm. The bijection is defined implicitly by the Burge local rule applied at each face of the cylindric growth diagram. The model must learn to invert a recursive process (the 𝔏_i composition from [Langer 2013, §4.2](https://arxiv.org/abs/2110.12629)) that peels off one ALCD label at each step by solving a local Burge equation. Despite this complexity, the transformer reaches 100% exact match on three of the four tested profiles and 99.98% on the longest.
+
 ## Key Idea: Structured 2D Token Embeddings
 
 Previous work encoded tableaux as flat bracket strings, destroying 2D geometry. We encode each tableau entry as a token with four learned embeddings:
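The structured-embedding idea can be sketched in a few lines. This is a hypothetical illustration, not the repository's API: the particular four factors shown (row index, column index, entry value, and which tableau the cell belongs to) and all names and dimensions are this sketch's assumptions.

```python
import random

# Hypothetical sketch of structured 2D token embeddings: each tableau cell
# becomes one token whose vector is the elementwise SUM of four learned
# factor embeddings. The factor choice (row, col, value, tableau id) is an
# assumption for illustration; see the repository for the actual design.

D = 8  # embedding dimension, arbitrary for this sketch
random.seed(0)

def make_table(num_ids, dim=D):
    """A learned lookup table, here just randomly initialized."""
    return [[random.gauss(0, 0.02) for _ in range(dim)] for _ in range(num_ids)]

row_emb = make_table(10)   # row index within the tableau
col_emb = make_table(10)   # column index within the tableau
val_emb = make_table(20)   # the entry written in the cell
tab_emb = make_table(2)    # which tableau (e.g. P vs. Q) the cell belongs to

def embed_cell(row, col, value, tableau_id):
    """Token vector = elementwise sum of the four factor embeddings."""
    parts = (row_emb[row], col_emb[col], val_emb[value], tab_emb[tableau_id])
    return [sum(xs) for xs in zip(*parts)]

def embed_tableau(tableau, tableau_id):
    """Flatten a 2D tableau into tokens; the geometry survives in the embedding."""
    return [embed_cell(r, c, v, tableau_id)
            for r, row in enumerate(tableau)
            for c, v in enumerate(row)]

tokens = embed_tableau([[1, 2, 5], [3, 4]], tableau_id=0)
assert len(tokens) == 5 and all(len(t) == D for t in tokens)
```

Because each token carries its (row, column) coordinates explicitly, the encoder never has to reconstruct the 2D shape from a flattened bracket string.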
@@ -124,6 +140,14 @@ Input: (P, Q) as 2n structured tokens
 | `checkpoints/encoder_rpp_6x4x2_m4/best.pt` | RSKEncoder on RPP shape (6,4,2), max_entry=4 | ~1.2M |
 | `checkpoints/encoder_rpp_2x2x2x2x2x1_m4/best.pt` | RSKEncoder on RPP shape (2,2,2,2,2,1), max_entry=4 | ~1.2M |
 
+ ### Experiment 4: Cylindric Plane Partitions
+
+ | File | Description | Parameters |
+ |------|-------------|------------|
+ | `checkpoints/encoder_cyl_1010_m3/best.pt` | RSKEncoder on CPP profile (1,0,1,0), max_label=3 | ~1.2M |
+ | `checkpoints/encoder_cyl_10100_m3/best.pt` | RSKEncoder on CPP profile (1,0,1,0,0), max_label=3 | ~1.2M |
+ | `checkpoints/encoder_cyl_101010_m3/best.pt` | RSKEncoder on CPP profile (1,0,1,0,1,0), max_label=3 | ~1.2M |
+
 ### Loading a checkpoint
 
 ```python
@@ -161,6 +185,12 @@ python train.py --model encoder --task rpp --shape 4,3,2,1 --max-entry 4 \
 --source sample --train-size 500000
 python train.py --model encoder --task rpp --shape 6,4,2 --max-entry 4 \
 --source sample --train-size 500000
+
+ # --- Experiment 4: Cylindric Plane Partitions ---
+ python train.py --model encoder --task cylindric --profile 1010 --max-label 3 \
+ --source sample --train-size 500000
+ python train.py --model encoder --task cylindric --profile 101010 --max-label 3 \
+ --source sample --train-size 500000
 ```
 
 ## Citation
 