Improve model card: Add library_name, update primary paper link, and add GitHub link

#1
by nielsr (HF Staff) - opened
Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -1,5 +1,6 @@
 ---
 license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - transformer
 - causal-lm
@@ -7,7 +8,7 @@ tags:
 - constructive-learning
 - frozen-embeddings
 - bvv
-pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # Model Card for abs-bvv-1
@@ -16,7 +17,7 @@ pipeline_tag: text-generation
 
 `abs-bvv-1` is a 1.3 billion parameter decoder-only Transformer model. It is the first model in the **Progressive Growth Transformers (PGT)** series, designed to explore how linguistic and reasoning capabilities emerge as a function of model depth.
 
-This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations](https://arxiv.org/abs/2507.04886)".
+This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129)".
 
 The core idea is to demonstrate an alternative, more modular and resource-efficient paradigm for building LLMs. The PGT series shows that:
 1. Semantic understanding can emerge without trainable embeddings.
@@ -80,6 +81,9 @@ If you use this model or the underlying concepts in your research, please cite o
 
 This work demonstrates that transformer blocks, not token embeddings, carry the semantic burden in LLMs — a step toward modular, fusable, multilingual LMs.
 
+## Code
+The code for this project and associated resources can be found on GitHub: [https://github.com/Bochkov/bvv-tokenizers](https://github.com/Bochkov/bvv-tokenizers).
+
 ## How to Use
 
 The model can be loaded using the `transformers` library. Note that `trust_remote_code=True` is required as it uses a custom model architecture.
@@ -103,4 +107,5 @@ outputs = model.generate(
     do_sample=True
 )
 
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
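
For reviewers who want the gist of the constructive-growth setup the card describes: the sketch below illustrates a frozen embedding table with transformer blocks appended one at a time. It is an illustration under stated assumptions, not the PGT training code; the layer type, dimensions, tied readout, and growth schedule are all guesses.

```python
import torch
import torch.nn as nn

# Illustrative sketch only (not the PGT training code): the vocabulary size,
# model width, layer type, and growth schedule below are assumptions.
vocab_size, d_model, n_heads = 1024, 512, 8

# Frozen, non-semantic embedding substrate: initialized once, never trained.
embeddings = nn.Embedding(vocab_size, d_model)
embeddings.weight.requires_grad_(False)

# The model is "grown" one block at a time; new blocks are the trainable part.
blocks = nn.ModuleList()

def grow(n_new: int = 1) -> None:
    """Append freshly initialized trainable blocks on top of the current stack."""
    for _ in range(n_new):
        blocks.append(nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True))

def forward(token_ids: torch.Tensor) -> torch.Tensor:
    # Causal mask so the encoder layer behaves like a decoder-only block.
    mask = nn.Transformer.generate_square_subsequent_mask(token_ids.size(1))
    h = embeddings(token_ids)  # no gradient ever reaches the embedding table
    for block in blocks:
        h = block(h, src_mask=mask)
    return h @ embeddings.weight.T  # tied, frozen readout -- also an assumption

grow(2)  # start shallow; call grow(1) again later to deepen the model
logits = forward(torch.randint(0, vocab_size, (1, 16)))
```

The point mirrors the card's claim: all semantic capacity has to come from the blocks, because the embedding table receives no gradient.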
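
The last hunk shows only the tail of the "How to Use" snippet. A minimal, self-contained version of the flow it implies is sketched below; the repo id `Bochkov/abs-bvv-1`, the prompt, and the sampling settings are assumptions, and only `do_sample=True` and the final `print(...)` line come from the diff itself.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- the diff shows only the tail of the snippet.
model_id = "Bochkov/abs-bvv-1"

# trust_remote_code=True is required per the card: the model uses a custom
# architecture that does not ship with transformers.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The first model in the PGT series", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # assumed value
    temperature=0.8,     # assumed value
    do_sample=True,      # matches the do_sample=True visible in the last hunk
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```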