Improve model card: Add library_name, update primary paper link, and add GitHub link
#1 opened by nielsr (HF Staff)

README.md CHANGED
````diff
@@ -1,5 +1,6 @@
 ---
 license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - transformer
 - causal-lm
@@ -7,7 +8,7 @@ tags:
 - constructive-learning
 - frozen-embeddings
 - bvv
-
+library_name: transformers
 ---
 
 # Model Card for abs-bvv-1
@@ -16,7 +17,7 @@ pipeline_tag: text-generation
 
 `abs-bvv-1` is a 1.3 billion parameter decoder-only Transformer model. It is the first model in the **Progressive Growth Transformers (PGT)** series, designed to explore how linguistic and reasoning capabilities emerge as a function of model depth.
 
-This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[
+This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129)".
 
 The core idea is to demonstrate an alternative, more modular and resource-efficient paradigm for building LLMs. The PGT series shows that:
 1. Semantic understanding can emerge without trainable embeddings.
@@ -80,6 +81,9 @@ If you use this model or the underlying concepts in your research, please cite o
 
 This work demonstrates that transformer blocks, not token embeddings, carry the semantic burden in LLMs — a step toward modular, fusable, multilingual LMs.
 
+## Code
+The code for this project and associated resources can be found on GitHub: [https://github.com/Bochkov/bvv-tokenizers](https://github.com/Bochkov/bvv-tokenizers).
+
 ## How to Use
 
 The model can be loaded using the `transformers` library. Note that `trust_remote_code=True` is required as it uses a custom model architecture.
@@ -103,4 +107,5 @@ outputs = model.generate(
     do_sample=True
 )
 
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
````