Update README.md
`abs-bvv-3` is a 1.7-billion-parameter decoder-only Transformer model. It is the third model in the **Progressive Growth Transformers (PGT)** series, designed to explore how linguistic and reasoning capabilities emerge as a function of model depth.
This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations](https://arxiv.org/abs/2507.04886)".
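The constructive growth described above can be sketched in a few lines of PyTorch. This is a hypothetical illustration, not the actual PGT training code: the dimensions, class name, and `grow()` method are invented for clarity, the token embeddings are simply frozen at random initialization rather than derived from visual Unicode representations, and `nn.TransformerEncoderLayer` stands in for a proper causally-masked decoder block.

```python
import torch.nn as nn


class ProgressiveTransformer(nn.Module):
    """Toy sketch of layer-by-layer constructive growth (assumed setup,
    not the published PGT implementation)."""

    def __init__(self, vocab_size=256, d_model=64, n_heads=4):
        super().__init__()
        # Frozen, non-semantic embeddings: fixed once, never updated.
        self.embed = nn.Embedding(vocab_size, d_model)
        self.embed.weight.requires_grad = False
        self.layers = nn.ModuleList()  # grows one layer per stage
        self.head = nn.Linear(d_model, vocab_size)
        self.d_model, self.n_heads = d_model, n_heads

    def grow(self):
        """Freeze all existing layers, then append one new trainable layer."""
        for p in self.layers.parameters():
            p.requires_grad = False
        self.layers.append(
            nn.TransformerEncoderLayer(self.d_model, self.n_heads,
                                       batch_first=True)
        )

    def forward(self, ids):
        x = self.embed(ids)
        for layer in self.layers:
            x = layer(x)  # causal masking omitted in this sketch
        return self.head(x)


model = ProgressiveTransformer()
model.grow()  # stage 1: one trainable layer on frozen embeddings
model.grow()  # stage 2: layer 1 is now frozen, layer 2 trains
```

At each stage only the newest layer receives gradients, which is what makes the approach resource-efficient: every training step touches a single layer's parameters rather than the whole stack.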
The core idea is to demonstrate an alternative, more modular and resource-efficient paradigm for building LLMs. The PGT series shows that:
1. Semantic understanding can emerge without trainable embeddings.