Update README.md
README.md CHANGED

````diff
@@ -24,7 +24,7 @@ It was trained on **160B tokens** using a mix of 40% FineWeb-Edu and 60% from Fi
 
 ### Intended use
 
-This model was trained on English math data and is not instruction-tuned, making it intended for text completion in English.
+This model was trained on English math data and is not instruction-tuned, making it intended for text completion in English. It is part of the FineMath ablation models we trained for FineMath (https://huggingface.co/HuggingFaceTB/finemath-ablation-4plus-160B), and is not necessarily the best possible outcome achievable with the given dataset.
 
 ### Generation
 
@@ -43,21 +43,6 @@ outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
 
-## Intermediate checkpoints
-
-We are releasing intermediate checkpoints for this model at intervals of every 10000 training steps (10B tokens) in separate branches. The naming convention is `10B`.
-
-You can load a specific model revision with `transformers` using the argument `revision`:
-```python
-model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/FineMath-Llama-3B", revision="10B")
-```
-You can access all the revisions for the models via the following code:
-```python
-from huggingface_hub import list_repo_refs
-out = list_repo_refs("HuggingFaceTB/FineMath-Llama-3B")
-print([b.name for b in out.branches])
-```
-
 ## Training
 ### Model
 - **Architecture**: Llama3
````
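For readers of the removed "Intermediate checkpoints" section, the following is a minimal sketch of selecting a checkpoint branch, assuming the naming convention implied by the removed text (one branch per 10B tokens, named `10B`, `20B`, …, `160B`). The `checkpoint_revision` helper is hypothetical, not part of the repository's published API.

```python
def checkpoint_revision(tokens_billions: int) -> str:
    """Map a token count in billions to its assumed branch name, e.g. 10 -> "10B".

    Assumes checkpoints were released every 10B tokens over the 160B-token run,
    as stated in the removed README section.
    """
    if tokens_billions % 10 != 0 or not (10 <= tokens_billions <= 160):
        raise ValueError("checkpoints were released every 10B tokens, up to 160B")
    return f"{tokens_billions}B"


# Loading a specific checkpoint (not executed here; requires network access
# and downloads ~3B parameters), mirroring the removed README code:
#
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "HuggingFaceTB/FineMath-Llama-3B",
#     revision=checkpoint_revision(10),  # branch "10B"
# )
```

The actual branch names can be confirmed with `huggingface_hub.list_repo_refs("HuggingFaceTB/FineMath-Llama-3B")`, as shown in the removed section.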