Pinkstack committed
Commit cd5e4a5 · verified · 1 Parent(s): 7a22456

Update README.md

Files changed (1):
  1. README.md +5 -1
README.md CHANGED
@@ -7,10 +7,14 @@ tags:
 - llama
 - trl
 - dpo
+- roleplay
+- math
+- code
 license: llama3.2
 language:
 - en
 pipeline_tag: text-generation
+library_name: transformers
 ---
 😁:```Hi Fijik!```

@@ -24,7 +28,7 @@ After merging, we used a custom dataset mix meant for this model, to improve its
 - **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
 After these two steps, we got a powerful model which has fewer parameters than Llama 3.1 8B yet performs just as well, if not better. Note that unlike our other models, it is not a thinking model. Our theory behind this model is that a smaller yet deeper model can outperform for its size.

-Meta states that Llama 3.2 was pre-trained on up to 9 trillion high-quality tokens.
+Meta states that Llama 3.2 was pre-trained on up to 9 trillion high-quality tokens, with a knowledge cutoff date of December 2023.

 # What should Fijik be used for?
 Fijik
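
For context on the second hunk: the card describes preference tuning as "DPO for 2 epochs for even better instruction following", run via unsloth. As a rough illustration only, here is a minimal sketch of such a step written with plain TRL rather than unsloth; the base checkpoint and the preference dataset below are placeholders, not the authors' actual setup.

```python
# Hedged sketch of a 2-epoch DPO run with TRL (TRL >= 0.12 API).
# The base model and dataset are assumptions, not taken from this card.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "meta-llama/Llama-3.2-3B-Instruct"  # placeholder base checkpoint
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# DPO expects preference pairs: "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")  # placeholder data

args = DPOConfig(output_dir="fijik-dpo", num_train_epochs=2)  # "DPO for 2 epochs"
trainer = DPOTrainer(model=model, args=args, train_dataset=dataset, processing_class=tokenizer)
trainer.train()
```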
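Similarly, the first hunk adds `library_name: transformers` alongside the existing `pipeline_tag: text-generation`, so the model is meant to be loaded through the transformers pipeline. A minimal usage sketch, with the hypothetical repo id `Pinkstack/Fijik` standing in for this repository's actual model id:

```python
from transformers import pipeline

# "Pinkstack/Fijik" is a stand-in; replace it with the real model id.
generator = pipeline("text-generation", model="Pinkstack/Fijik")

# Chat-style input, echoing the card's own greeting example.
messages = [{"role": "user", "content": "Hi Fijik!"}]
print(generator(messages, max_new_tokens=128)[0]["generated_text"])
```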