Bochkov committed (verified)
Commit 663e71c · 1 Parent(s): 3a0b094

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -16,7 +16,7 @@ pipeline_tag: text-generation
 
 `abs-bvv-3` is a 1.7 billion parameter decoder-only Transformer model. It is the 3rd model in the **Progressive Growth Transformers (PGT)** series, designed to explore how linguistic and reasoning capabilities emerge as a function of model depth.
 
-This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations](arXiv.org)".
+This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations](https://arxiv.org/abs/2507.04886)".
 
 The core idea is to demonstrate an alternative, more modular and resource-efficient paradigm for building LLMs. The PGT series shows that:
 1. Semantic understanding can emerge without trainable embeddings.
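
The constructive growth the README describes can be pictured with a minimal PyTorch sketch. This is not the released PGT training code: the module names, dimensions, and the use of `nn.TransformerEncoderLayer` with a causal mask as a stand-in for a decoder block are illustrative assumptions; only the idea of a frozen embedding table with layers appended and trained one stage at a time comes from the model description.

```python
# Minimal sketch (illustrative only, not the released PGT code):
# a decoder-only LM grown one block at a time over frozen embeddings.
import torch
import torch.nn as nn

VOCAB_SIZE, D_MODEL, N_HEADS = 1024, 256, 8  # toy sizes, not the 1.7B config

class GrowingDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Frozen, non-trainable embedding table (stand-in for the fixed
        # visual Unicode representations described in the paper).
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.embed.weight.requires_grad = False
        self.blocks = nn.ModuleList()
        self.lm_head = nn.Linear(D_MODEL, VOCAB_SIZE, bias=False)

    def grow(self):
        # Append one new Transformer block; encoder layer + causal mask
        # is used here as a simple decoder-block stand-in.
        block = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=N_HEADS,
            dim_feedforward=4 * D_MODEL, batch_first=True,
        )
        self.blocks.append(block)
        return block

    def forward(self, ids):
        x = self.embed(ids)
        mask = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        for block in self.blocks:
            x = block(x, src_mask=mask)
        return self.lm_head(x)

model = GrowingDecoder()
ids = torch.randint(0, VOCAB_SIZE, (2, 16))  # toy batch of token ids

for stage in range(3):  # grow three layers, one training stage per layer
    new_block = model.grow()
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=3e-4)
    logits = model(ids)
    loss = nn.functional.cross_entropy(
        logits[:, :-1].reshape(-1, VOCAB_SIZE), ids[:, 1:].reshape(-1)
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    # In a real run: train this stage to convergence, then optionally
    # freeze new_block's parameters before the next grow() call.
```

The sketch only illustrates the control flow (frozen embeddings, append a block, train, repeat); depth, freezing schedule, and data are placeholders.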