Update README.md

README.md CHANGED

@@ -17,12 +17,12 @@ We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** m
 
 | Parameter | Value |
 | :---: | :---: |
-| **Batch Size** | 128 |
+| **Batch Size (train\_prompt\_bsz)** | 128 |
 | **N Samples Per Prompt** | 16 |
 | **Global Batch Size** | 2048 |
 | **Maximum Response Length** | 16384 |
 | **Rollout Temperature** | 1.1 |
-| **Learning Rate** | 1e-6 |
+| **Learning Rate (LR)** | 1e-6 |
 | **Weight Decay** | 0.1 |
 | **Eps Clip** | 0.2 |
 | **KL Loss Coefficient** | 0.00 |

@@ -40,7 +40,6 @@ The resulting model is the **InfiR2-R1-7B-FP8**.
 - Stable and Reproducible Performance
 - Efficient and Low memory Training
 
----
 
 ## InfiR2 Model Series
 

@@ -54,7 +53,6 @@ The InfiR2 framework offers multiple variants model with different size and trai
 - [InfiR2-7B-Instruct-FP8](https://huggingface.co/InfiX-ai/InfiR2-7B-Instruct-FP8): *Supervised fine-tuning on InfiR2-7B-base-FP8 with [InfiAlign dataset](https://huggingface.co/papers/2508.05496)*
 - [InfiR2-R1-7B-FP8](https://huggingface.co/InfiX-ai/InfiR2-R1-7B-FP8): *Reinforcement learning on InfiR2-7B-Instruct-FP8 with dapo dataset*
 
----
 
 ## Model Performance
 Below is the performance comparison of **InfiR2-R1-7B-FP8** on reasoning benchmarks. Note: 'w. InfiAlign' denotes Supervised Fine-Tuning (SFT) using the InfiAlign dataset.

@@ -99,7 +97,6 @@ Below is the performance comparison of **InfiR2-R1-7B-FP8** on reasoning benchma
 
 </div>
 
----
 
 ## Quick Start
 

@@ -149,7 +146,6 @@ print(f"(LLM Response): \n{llm_response}")
 print("="*70)
 ````
 
------
 
 ## Model Download
 

@@ -160,7 +156,6 @@ mkdir -p ./models
 huggingface-cli download --resume-download InfiX-ai/InfiR2-R1-7B-FP8 --local-dir ./models/InfiR2-R1-7B-FP8
 ```
 
------
 
 ## Intended Uses
 

@@ -180,13 +175,11 @@ The model should **not** be used for:
 - Generating harmful, offensive, or inappropriate content
 - Creating misleading information
 
------
 
 ## Acknowledgements
 
 * We would like to express our gratitude for the following open-source projects: [Slime](https://github.com/THUDM/slime), [Megatron](https://github.com/NVIDIA/Megatron-LM), [TransformerEngine](https://github.com/NVIDIA/TransformerEngine) and [Qwen2.5](https://github.com/QwenLM/Qwen2.5-Math).
 
------
 
 ## Citation
 
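As a quick cross-check of the RL hyperparameter table in the first hunk, the batch settings are internally consistent: 128 prompts per step with 16 samples per prompt yields the stated global batch of 2048. A minimal sketch (the dict keys are illustrative, not the trainer's actual flag names):

```python
# Hypothetical restatement of the README's RL hyperparameter table.
# Key names are illustrative only, not taken from the InfiR2 codebase.
rl_config = {
    "batch_size": 128,            # prompts per step (train_prompt_bsz)
    "n_samples_per_prompt": 16,   # rollouts sampled per prompt
    "global_batch_size": 2048,    # should equal batch_size * n_samples_per_prompt
    "max_response_length": 16384,
    "rollout_temperature": 1.1,
    "learning_rate": 1e-6,
    "weight_decay": 0.1,
    "eps_clip": 0.2,
    "kl_loss_coefficient": 0.00,
}

# Sanity check: the global batch is the prompt batch times samples per prompt.
assert rl_config["batch_size"] * rl_config["n_samples_per_prompt"] == rl_config["global_batch_size"]
print("global batch size:", rl_config["global_batch_size"])  # prints 2048
```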
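The "Eps Clip" (0.2) and "KL Loss Coefficient" (0.00) rows in the hyperparameter table suggest a PPO-style clipped surrogate objective with the KL penalty disabled. A minimal per-sample sketch under that assumption (function names are hypothetical, not the actual InfiR2 training code):

```python
# Hypothetical sketch of a PPO-style clipped objective implied by the
# table's Eps Clip / KL Loss Coefficient values. Illustrative only.
EPS_CLIP = 0.2   # "Eps Clip" from the table
KL_COEF = 0.00   # "KL Loss Coefficient" from the table

def clipped_objective(ratio: float, advantage: float) -> float:
    """Clipped surrogate for one sample: min of unclipped and clipped terms."""
    clipped_ratio = max(1.0 - EPS_CLIP, min(1.0 + EPS_CLIP, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)

def loss(ratio: float, advantage: float, kl: float) -> float:
    # With KL_COEF = 0.00, the KL term contributes nothing to the loss.
    return -clipped_objective(ratio, advantage) + KL_COEF * kl

# A ratio well above 1 + eps is clipped when the advantage is positive:
print(loss(1.5, 1.0, 0.3))  # prints -1.2 (ratio clipped to 1.2)
```

With the KL coefficient at zero the policy is constrained only by the clipping term, which matches the table as written.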