Safetensors
qwen2
fp8
juezhi committed · Commit aef6860 · verified · 1 Parent(s): 2af409f

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -18,12 +18,12 @@ We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** m
  
  | Parameter | Value |
  | :---: | :---: |
- | **Batch Size (train\_prompt\_bsz)** | 128 |
+ | **Batch Size** | 128 |
  | **N Samples Per Prompt** | 16 |
  | **Global Batch Size** | 2048 |
  | **Maximum Response Length** | 16384 |
  | **Rollout Temperature** | 1.1 |
- | **Learning Rate (LR)** | 1e-6 |
+ | **Learning Rate** | 1e-6 |
  | **Weight Decay** | 0.1 |
  | **Eps Clip** | 0.2 |
  | **KL Loss Coefficient** | 0.00 |
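
The values in the table fit together arithmetically: 128 prompts per batch times 16 samples per prompt gives the listed global batch size of 2048. The sketch below collects the table into a small config object and checks that relationship. It is illustrative only. The class name `RLConfig` and every field name except `train_prompt_bsz` (which appears in the earlier README wording) are hypothetical, and treating the global batch size as the product of the two values is an assumption inferred from the numbers, not something the README states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RLConfig:
    """Hypothetical container for the hyperparameters listed in the table."""

    train_prompt_bsz: int = 128        # Batch Size (prompts per training step)
    n_samples_per_prompt: int = 16     # N Samples Per Prompt
    max_response_length: int = 16384   # Maximum Response Length (tokens)
    rollout_temperature: float = 1.1   # Rollout Temperature
    learning_rate: float = 1e-6        # Learning Rate
    weight_decay: float = 0.1          # Weight Decay
    eps_clip: float = 0.2              # Eps Clip
    kl_loss_coef: float = 0.0          # KL Loss Coefficient

    @property
    def global_batch_size(self) -> int:
        # Assumption: global batch = prompts per batch * samples per prompt.
        return self.train_prompt_bsz * self.n_samples_per_prompt


cfg = RLConfig()
# Matches the "Global Batch Size" row under the assumption above.
assert cfg.global_batch_size == 2048
```

Deriving the global batch size instead of storing it keeps the three rows from drifting apart if one of them is edited later.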