bigscience/bloom-560m

@@ -193,6 +193,8 @@ The BLOOM tokenizer ([link](https://huggingface.co/bigscience/tokenizer)) is a l
 - A vocabulary size of 250,680
+The vocabulary size was padded to 250,880 for practical purposes during training, but the effective model vocabulary size is 250,680.
 It was trained on a subset of a preliminary version of the corpus using alpha-weighting per language.
 </details>
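
The relationship between the two vocabulary sizes above can be illustrated with a small sketch. It rounds the tokenizer vocabulary size up to a multiple, which is a common way to pad an embedding matrix. The multiple of 512 is an assumption chosen for illustration and is not stated in the model card, and `pad_vocab_size` is a hypothetical helper, not part of any library.

```python
def pad_vocab_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-vocab_size // multiple) * multiple


TOKENIZER_VOCAB_SIZE = 250_680  # effective vocabulary size from the model card
ASSUMED_MULTIPLE = 512          # assumption for illustration, not from the model card

padded = pad_vocab_size(TOKENIZER_VOCAB_SIZE, ASSUMED_MULTIPLE)
unused_rows = padded - TOKENIZER_VOCAB_SIZE

print(padded)       # 250880
print(unused_rows)  # 200
```

Under this assumption, the 200 extra embedding rows correspond to no tokenizer entry, which is consistent with the effective vocabulary size remaining 250,680.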