Safetensors
qwen2
fp8
baicaihaochi121 committed on
Commit 4e4e430 · verified · 1 Parent(s): 9055bc2

Update README.md

Files changed (1)
  1. README.md +14 -13
README.md CHANGED
@@ -10,19 +10,7 @@ license: apache-2.0
    <a href="https://infix-ai.com/research/infir2/">🌐 Project Website</a> &nbsp;
  </p>
 
- We performed supervised fine-tuning on the **InfiR2-7B-base-FP8** with FP8 format in two stages using the InfiAlign-SFT-72k and InfiAlign-SFT-165k datasets, with hyperparameters shown in below.
-
- <div align="center">
-
- | Parameter | Value |
- | :---: | :---: |
- | **Batch Size** | 64 |
- | **Learning Rate** | 1e-5 |
- | **Minimum Learning Rate** | 1e-6 |
- | **Weight Decay** | 0.05 |
- | **Context Length** | 32k |
-
- </div>
+ We performed supervised fine-tuning on the **InfiR2-7B-base-FP8** with FP8 format in two stages using the InfiAlign-SFT-72k and InfiAlign-SFT-165k datasets.
 
  The resulting model is the **InfiR2-7B-Instruct-FP8**.
 
@@ -35,6 +23,19 @@ The resulting model is the **InfiR2-7B-Instruct-FP8**.
  - Stable and Reproducible Performance
  - Efficient and Low memory Training
 
+ **Hyperparameters**:
+
+ <div align="center">
+
+ | Parameter | Value |
+ | :---: | :---: |
+ | **Batch Size** | 64 |
+ | **Learning Rate** | 1e-5 |
+ | **Minimum Learning Rate** | 1e-6 |
+ | **Weight Decay** | 0.05 |
+ | **Context Length** | 32k |
+
+ </div>
 
 
  ## 🚀 InfiR2 Model Series
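
For readers who want to reuse the values from the hyperparameter table in this diff, the following is a minimal sketch of how they might be captured in code. It is not the authors' training script. The names `SFTHyperparameters` and `cosine_lr` are hypothetical, reading "32k" as 32 × 1024 tokens is an assumption, and the cosine decay shape is also an assumption: the README lists only the peak and minimum learning rates, not the schedule that connects them.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTHyperparameters:
    """Values copied from the Hyperparameters table in the README diff."""
    batch_size: int = 64
    learning_rate: float = 1e-5
    min_learning_rate: float = 1e-6
    weight_decay: float = 0.05
    # The README says "32k"; treating that as 32 * 1024 tokens is an assumption.
    context_length: int = 32 * 1024


def cosine_lr(step: int, total_steps: int, hp: SFTHyperparameters) -> float:
    """Cosine decay from learning_rate down to min_learning_rate.

    The schedule shape is an assumption; the README gives only the endpoints.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay within the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return hp.min_learning_rate + (hp.learning_rate - hp.min_learning_rate) * cosine


hp = SFTHyperparameters()
print(cosine_lr(0, 1000, hp))     # peak learning rate at the first step
print(cosine_lr(1000, 1000, hp))  # minimum learning rate at the last step
```

Keeping the values in a frozen dataclass makes it harder to change them by accident between the two fine-tuning stages; whether both stages actually share the same settings is not stated in the README.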