Signed-off-by: peter szemraj <peterszemraj@gmail.com>
README.md CHANGED

@@ -28,7 +28,7 @@ In our country, we say _"To let 100M parameters model generate python script and
 
 ## Base Model Information
 
-The base model, smol_llama-101M-GQA,
+The base model, smol_llama-101M-GQA, has been pre-trained on a relatively small number of high quality tokens (less than ~20B). It has impressive performance despite its compact size of 101M parameters. Training data for this base model included:
 
 - [JeanKaddour/minipile](https://huggingface.co/datasets/JeanKaddour/minipile)
 - [pszemraj/simple_wikipedia_LM](https://huggingface.co/datasets/pszemraj/simple_wikipedia_LM)
|