Update README.md
README.md CHANGED

````diff
@@ -24,7 +24,7 @@ It was trained on **160B tokens** using a mix of 40% FineWeb-Edu and 60% from Fi
 
 ### Intended use
 
-This model was trained on English math data and is not instruction-tuned, making it intended for text completion in English.
+This model was trained on English math data and is not instruction-tuned, making it intended for text completion in English. It is part of the FineMath ablation models we trained for FineMath (https://huggingface.co/HuggingFaceTB/finemath-ablation-4plus-160B), and is not necessarily the best possible outcome achievable with the given dataset.
 
 ### Generation
 
@@ -43,21 +43,6 @@ outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
 
-## Intermediate checkpoints
-
-We are releasing intermediate checkpoints for this model at intervals of every 10000 training steps (10B tokens) in separate branches. The naming convention is `10B`.
-
-You can load a specific model revision with `transformers` using the argument `revision`:
-```python
-model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/FineMath-Llama-3B", revision="10B")
-```
-You can access all the revisions for the models via the following code:
-```python
-from huggingface_hub import list_repo_refs
-out = list_repo_refs("HuggingFaceTB/FineMath-Llama-3B")
-print([b.name for b in out.branches])
-```
-
 ## Training
 ### Model
 - **Architecture**: Llama3
````
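For readers of the removed "Intermediate checkpoints" section, the following is a minimal sketch of selecting a checkpoint branch, assuming the naming convention implied by the removed text (one branch per 10B tokens, named `10B`, `20B`, …, `160B`). The `checkpoint_revision` helper is hypothetical, not part of the repository's published API.

```python
def checkpoint_revision(tokens_billions: int) -> str:
    """Map a token count in billions to its assumed branch name, e.g. 10 -> "10B".

    Assumes checkpoints were released every 10B tokens over the 160B-token run,
    as stated in the removed README section.
    """
    if tokens_billions % 10 != 0 or not (10 <= tokens_billions <= 160):
        raise ValueError("checkpoints were released every 10B tokens, up to 160B")
    return f"{tokens_billions}B"


# Loading a specific checkpoint (not executed here; requires network access
# and downloads ~3B parameters), mirroring the removed README code:
#
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "HuggingFaceTB/FineMath-Llama-3B",
#     revision=checkpoint_revision(10),  # branch "10B"
# )
```

The actual branch names can be confirmed with `huggingface_hub.list_repo_refs("HuggingFaceTB/FineMath-Llama-3B")`, as shown in the removed section.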