shamz15531 committed
Commit c819763 · verified · 1 Parent(s): d08b6bb

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -43,11 +43,11 @@ We have published a comprehensive [report](https://arxiv.org/pdf/2501.13944) wit
 ## Model Training
 
 #### Pretraining
-Fanar was continually pretrained on 1T tokens, with a balanced focus on Arabic and English: ~515B English tokens from a carefully curated subset of the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset, 410B Arabic tokens that we collected, parsed, and filtered from a variety of sources, and 102B code tokens curated from [The Stack](https://github.com/bigcode-project/the-stack-v2) dataset. Our codebase used the [LitGPT](https://github.com/Lightning-AI/litgpt) framework.
+Fanar-1-9B was continually pretrained on 1T tokens, with a balanced focus on Arabic and English: ~515B English tokens from a carefully curated subset of the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset, 410B Arabic tokens that we collected, parsed, and filtered from a variety of sources, and 102B code tokens curated from [The Stack](https://github.com/bigcode-project/the-stack-v2) dataset. Our codebase used the [LitGPT](https://github.com/Lightning-AI/litgpt) framework.
 
 ## Getting Started
 
-Fanar is compatible with the Hugging Face `transformers` library (≥ v4.40.0). Here's how to load and use the model:
+Fanar-1-9B is compatible with the Hugging Face `transformers` library (≥ v4.40.0). Here's how to load and use the model:
 
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
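
The README's usage snippet is cut off at the diff boundary. A minimal sketch of how a load-and-generate example with `AutoTokenizer` and `AutoModelForCausalLM` typically continues; the repo id `QCRI/Fanar-1-9B` is an assumption (it is not shown in this diff), so check the model card for the actual id:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical repo id -- not stated in this diff; verify on the model card.
MODEL_ID = "QCRI/Fanar-1-9B"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the tokenizer and model, then return a decoded completion."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" places the weights on available GPUs (needs `accelerate`).
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example usage (downloads the weights on first call):
#   print(generate("What is the capital of Qatar?"))
```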