juezhi committed on commit 5d1a0c6 · verified · 1 parent: 5f3cb04

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -17,12 +17,12 @@ We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** m
 
  | Parameter | Value |
  | :---: | :---: |
- | **Batch Size (train\_prompt\_bsz)** | 128 |
+ | **Batch Size** | 128 |
  | **N Samples Per Prompt** | 16 |
  | **Global Batch Size** | 2048 |
  | **Maximum Response Length** | 16384 |
  | **Rollout Temperature** | 1.1 |
- | **Learning Rate (LR)** | 1e-6 |
+ | **Learning Rate** | 1e-6 |
  | **Weight Decay** | 0.1 |
  | **Eps Clip** | 0.2 |
  | **KL Loss Coefficient** | 0.00 |
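
The batch-size rows in the table are numerically consistent with each other: 128 prompts times 16 samples per prompt gives 2048. The sketch below checks that arithmetic. It assumes, without confirmation from the README, that the global batch size is defined as this product; the function and variable names are hypothetical and not taken from any training framework.

```python
# Values copied from the hyperparameter table in README.md.
batch_size = 128            # prompts per training step ("Batch Size")
n_samples_per_prompt = 16   # rollouts sampled for each prompt
global_batch_size = 2048    # "Global Batch Size" as listed in the table


def rollouts_per_step(prompts: int, samples_per_prompt: int) -> int:
    """Total sampled responses per step (assumed meaning of global batch size)."""
    return prompts * samples_per_prompt


# Assumption: global batch size = prompts * samples per prompt.
computed = rollouts_per_step(batch_size, n_samples_per_prompt)
assert computed == global_batch_size
print(computed)
```

If the three values are ever edited independently, a check of this kind would flag a table that no longer agrees with itself.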