Update README.md
README.md CHANGED

@@ -24,6 +24,8 @@ This repository provides **Model_64_BIT (272M)** — an **ablation model** from
 
 [📚 Paper (Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate)](https://huggingface.co/papers/2507.07129) -
 
+[📚 Blog Article](https://huggingface.co/blog/Bochkov/emergent-semantics-beyond-token-embeddings)
+
 This checkpoint is designed to test whether a Transformer can learn robust language behavior when the **entire input embedding layer is frozen** and contains **no semantic or visual signal**.
 
 Compared to **Model_16_BIT**, this model uses a larger frozen binary code (`n_embed=64`), but the codes are **randomly generated** rather than encoding the token index directly.
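For context on the paragraph this commit amends: it describes an input embedding table of frozen, randomly generated 64-bit binary codes. Below is a minimal PyTorch sketch of that idea. It is an illustration under stated assumptions, not the repository's actual code: the function name, the `vocab_size` value, the seed handling, and the 0/1 (rather than, say, ±1) encoding are all assumptions.

```python
import torch
import torch.nn as nn

def random_binary_embedding(vocab_size: int, n_embed: int = 64,
                            seed: int = 0) -> nn.Embedding:
    """Build a frozen embedding table whose rows are random binary codes.

    Unlike a table where each code is the binary expansion of the token
    index (as in Model_16_BIT), each row here is an i.i.d. random 0/1
    vector, so the table carries no semantic or visual signal and no
    information about the token index.
    """
    g = torch.Generator().manual_seed(seed)  # fixed seed: codes are reproducible
    codes = torch.randint(0, 2, (vocab_size, n_embed), generator=g).float()
    # freeze=True keeps the entire input embedding layer untrained.
    return nn.Embedding.from_pretrained(codes, freeze=True)

# Hypothetical usage: token ids map to fixed random 64-bit codes.
emb = random_binary_embedding(vocab_size=50257)
x = emb(torch.tensor([[1, 2, 3]]))  # shape (1, 3, 64); requires_grad is False
```

The key design point, as the README paragraph states, is that because the codes are random rather than index-derived, any language behavior the model learns must come from the trainable Transformer layers above this frozen substrate.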