shamz15531 committed on
Commit 8a5e0bc · verified · 1 Parent(s): f9205a9

Update README.md

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -12,7 +12,9 @@ library_name: transformers
12
 
13
  **Fanar-1-9B-Instruct** is a powerful Arabic-English LLM developed by [Qatar Computing Research Institute (QCRI)](https://www.hbku.edu.qa/en/qcri) and [Hamad Bin Khalifa University (HBKU)](https://www.hbku.edu.qa/). It is the instruction-tuned version of [Fanar-1-9B](). Fanar continually pretrains the `google/gemma-2-9b` model on 1T Arabic and English tokens. Fanar pays particular attention to the richness of the Arabic language by supporting a diverse set of Arabic dialects including Modern Standard Arabic (MSA), Levantine, and Egyptian. Fanar, through meticulous curation of the pretraining and instruction-tuning data, is aligned with Arab cultural values.
14
 
15
- We have published a comprehensive [report](https://arxiv.org/pdf/2501.13944) with all the details regarding FANAR. We also provide an API to the model (request access [here](https://api.fanar.qa/request/en)).
 
 
16
 
17
  ---
18
 
@@ -21,6 +23,7 @@ We have published a comprehensive [report](https://arxiv.org/pdf/2501.13944) wit
21
  | Attribute | Value |
22
  |---------------------------|------------------------------------|
23
  | Developed by | [QCRI](https://www.hbku.edu.qa/en/qcri) and [HBKU](https://www.hbku.edu.qa/) |
 
24
  | Model Type | Autoregressive Transformer |
25
  | Parameter Count | 8.7 Billion |
26
  | Context Length | 4096 Tokens |
@@ -32,6 +35,7 @@ We have published a comprehensive [report](https://arxiv.org/pdf/2501.13944) wit
32
  | DPO Preference Pairs | 250K |
33
  | Languages | Arabic, English |
34
  | License | Apache 2.0 |
 
35
  <!-- | Precision | bfloat16 | -->
36
 
37
  ---
@@ -64,6 +68,7 @@ model_name = "QCRI/Fanar-1-9B-Instruct"
64
  tokenizer = AutoTokenizer.from_pretrained(model_name)
65
  model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
66
 
 
67
  messages = [
68
  {"role": "user", "content": "ما هي عاصمة قطر؟"},
69
  ]
 
12
 
13
  **Fanar-1-9B-Instruct** is a powerful Arabic-English LLM developed by [Qatar Computing Research Institute (QCRI)](https://www.hbku.edu.qa/en/qcri) and [Hamad Bin Khalifa University (HBKU)](https://www.hbku.edu.qa/). It is the instruction-tuned version of [Fanar-1-9B](). Fanar continually pretrains the `google/gemma-2-9b` model on 1T Arabic and English tokens. Fanar pays particular attention to the richness of the Arabic language by supporting a diverse set of Arabic dialects including Modern Standard Arabic (MSA), Levantine, and Egyptian. Fanar, through meticulous curation of the pretraining and instruction-tuning data, is aligned with Arab cultural values.
14
 
15
+ **Fanar-1-9B-Instruct** is a core component of the [Fanar GenAI platform](https://chat.fanar.qa/), which offers a suite of capabilities including image generation, video and image understanding, deep thinking, advanced text-to-speech (TTS) and automatic speech recognition (ASR), attribution and fact-checking, and Islamic RAG, among other features.
16
+
17
+ We have published a comprehensive [report](https://arxiv.org/pdf/2501.13944) with all the details regarding FANAR. We also provide an API to the model and our GenAI platform (request access [here](https://api.fanar.qa/request/en)).
18
 
19
  ---
20
 
 
23
  | Attribute | Value |
24
  |---------------------------|------------------------------------|
25
  | Developed by | [QCRI](https://www.hbku.edu.qa/en/qcri) and [HBKU](https://www.hbku.edu.qa/) |
26
+ | Sponsored by | [MCIT](https://www.mcit.gov.qa/en/) |
27
  | Model Type | Autoregressive Transformer |
28
  | Parameter Count | 8.7 Billion |
29
  | Context Length | 4096 Tokens |
 
35
  | DPO Preference Pairs | 250K |
36
  | Languages | Arabic, English |
37
  | License | Apache 2.0 |
38
+
39
  <!-- | Precision | bfloat16 | -->
40
 
41
  ---
 
68
  tokenizer = AutoTokenizer.from_pretrained(model_name)
69
  model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
70
 
71
+ # message content may be in Arabic or English
72
  messages = [
73
  {"role": "user", "content": "ما هي عاصمة قطر؟"},
74
  ]
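
The hunk above ends at the `messages` list, so the generation step is not visible in this diff. As a rough sketch only, and not necessarily what the README itself does, the remaining steps could look like the following. The helper names `build_messages` and `generate_reply` and the `max_new_tokens=256` default are assumptions made for illustration. The model name, the `AutoTokenizer` / `AutoModelForCausalLM` loading calls, and the message format come from the snippet above.

```python
from typing import Dict, List

# Model identifier taken from the README snippet in this diff.
MODEL_NAME = "QCRI/Fanar-1-9B-Instruct"


def build_messages(user_text: str) -> List[Dict[str, str]]:
    """Wrap one user turn in the chat-message format shown in the README.

    Hypothetical helper; the content may be Arabic or English.
    """
    text = user_text.strip()
    if not text:
        raise ValueError("user_text must not be empty")
    return [{"role": "user", "content": text}]


def generate_reply(user_text: str, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load the model and return a single reply.

    Assumes `transformers`, `torch`, and `accelerate` (needed for
    device_map="auto") are installed and that the weights can be downloaded.
    """
    # Imported lazily so build_messages works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

    # Render the chat template and tokenize in one step, so special tokens
    # are added once, by the template itself.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_text),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate_reply("ما هي عاصمة قطر؟")` would download the weights on first use. The sketch sends a single user turn only, because whether the model's chat template accepts other roles is not stated in this diff.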