This makes BLOOM a representative example of multilingual BPE tokenization.
## Model Architecture

- **Architecture:** Decoder-only Transformer (Lingua's Llama-3.2-1B configuration)
- **Non-embedding parameters:** ~1B
- **Context length:** 4096 tokens
- **Framework:** Meta Lingua
- **Initialization:** Shared super-vocabulary initialization across TokSuite models

The architecture and training setup are identical across all TokSuite models; only the tokenizer differs.
---