---
license: apache-2.0
---

# InfiR2-1.5B-Instruct-FP8

<p align="center">
  <a href="https://arxiv.org/abs/2509.22536">📄 Paper</a> |
  <a href="https://github.com/InfiXAI/InfiR2">💻 Github</a> |
  <a href="https://infix-ai.com/research/infir2/">🌐 Project Website</a>
</p>

We performed two-stage supervised fine-tuning on **InfiR2-1.5B-base-FP8** in the FP8 format, using the InfiAlign-SFT-72k and InfiAlign-SFT-165k datasets.

**Training Recipe**:

<p align="center">
  <img src="fp8_recipe.png" width="100%"/>
</p>

- Stable and Reproducible Performance
- Efficient and Low-Memory Training
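
If FP8 is unfamiliar, the standalone snippet below (not part of the InfiR2 recipe; just an illustration using PyTorch's `float8_e4m3fn` dtype, available in PyTorch 2.1+) shows the rounding that FP8 storage introduces:

```python
import torch

# Illustration only: round a few FP32 values to FP8 E4M3 and inspect the
# quantization error. Requires PyTorch >= 2.1 for torch.float8_e4m3fn.
x = torch.tensor([0.1234, 1.5, 3.14159, 240.0])
x_fp8 = x.to(torch.float8_e4m3fn)

# FP8 tensors should be upcast before printing or arithmetic.
print(x_fp8.to(torch.float32))      # values rounded to the nearest representable FP8 number
print(x - x_fp8.to(torch.float32))  # per-element rounding error
```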

**Hyperparameters**:

<div align="center">

| Parameter | Value |
| :---: | :---: |
| **Batch Size** | 64 |
| **Learning Rate** | 5e-5 |
| **Minimum Learning Rate** | 5e-6 |
| **Weight Decay** | 0.05 |
| **Context Length** | 32k |

</div>
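
The table gives a peak learning rate of 5e-5 and a floor of 5e-6, but does not state the decay schedule. Purely as an illustrative sketch, a cosine decay with a minimum (a common choice; the step count here is hypothetical) would look like this:

```python
import math

# Hypothetical sketch: cosine decay from the card's peak LR (5e-5) to its
# floor (5e-6). The actual InfiR2 schedule and step count are not specified.
LR_MAX, LR_MIN, TOTAL_STEPS = 5e-5, 5e-6, 10_000

def lr_at(step: int) -> float:
    """Learning rate at a given step under cosine decay with a floor."""
    progress = min(step / TOTAL_STEPS, 1.0)
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))                 # ~5e-05 (peak)
print(lr_at(TOTAL_STEPS // 2))  # ~2.75e-05 (midpoint)
print(lr_at(TOTAL_STEPS))       # ~5e-06 (floor)
```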

The resulting model is **InfiR2-1.5B-Instruct-FP8**.

## 📚 InfiR2 Model Series

The InfiR2 framework offers multiple model variants with different sizes and training strategies:

- **1.5B**
  - [InfiR2-1.5B-base-FP8](https://huggingface.co/InfiX-ai/InfiR2-1.5B-base-FP8): *Continued pretraining on Qwen2.5-1.5B-base*
  - [InfiR2-1.5B-Instruct-FP8](https://huggingface.co/InfiX-ai/InfiR2-1.5B-Instruct-FP8): *Supervised fine-tuning on InfiR2-1.5B-base-FP8 with the [InfiAlign dataset](https://huggingface.co/papers/2508.05496)*
- **7B**
  - [InfiR2-7B-base-FP8](https://huggingface.co/InfiX-ai/InfiR2-7B-base-FP8): *Continued pretraining on Qwen2.5-7B-base*
  - [InfiR2-7B-Instruct-FP8](https://huggingface.co/InfiX-ai/InfiR2-7B-Instruct-FP8): *Supervised fine-tuning on InfiR2-7B-base-FP8 with the [InfiAlign dataset](https://huggingface.co/papers/2508.05496)*
  - [InfiR2-R1-7B-FP8-Preview](https://huggingface.co/InfiX-ai/InfiR2-R1-7B-FP8-Preview): *Multi-stage FP8 reinforcement learning*

## 📊 Model Performance

Below is a performance comparison of InfiR2-1.5B-Instruct-FP8 against baseline models on reasoning benchmarks. Note: "w. InfiAlign" denotes supervised fine-tuning (SFT) with the InfiAlign dataset.

<div align="center">

<table>
  <thead>
    <tr>
      <th align="left">Model</th>
      <th align="center">AIME 25</th>
      <th align="center">AIME 24</th>
      <th align="center">GPQA</th>
      <th align="center">LiveCodeBench v5</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="left"><strong>Deepseek-Distill-Qwen-1.5B</strong></td>
      <td align="center">21.35</td>
      <td align="center">26.87</td>
      <td align="center">32.26</td>
      <td align="center">18.50</td>
    </tr>
    <tr>
      <td align="left"><strong>Qwen2.5-1.5B-base (w. InfiAlign)</strong></td>
      <td align="center">14.58</td>
      <td align="center">10.52</td>
      <td align="center">28.98</td>
      <td align="center">12.99</td>
    </tr>
    <tr>
      <td align="left"><strong>InfiR2-1.5B-Instruct-FP8</strong></td>
      <td align="center">18.45</td>
      <td align="center">17.39</td>
      <td align="center">29.48</td>
      <td align="center">17.10</td>
    </tr>
  </tbody>
</table>

</div>

## 🚀 Quick Start

```python
from vllm import LLM, SamplingParams

MODEL_NAME = "InfiX-ai/InfiR2-1.5B-Instruct-FP8"

prompt_text = "Briefly explain what a black hole is, and provide two interesting facts."

MAX_NEW_TOKENS = 256
TEMPERATURE = 0.8

# Load the model; vLLM infers the dtype and FP8 quantization from the checkpoint config.
llm = LLM(
    model=MODEL_NAME,
    dtype="auto",
)

sampling_params = SamplingParams(
    n=1,
    temperature=TEMPERATURE,
    max_tokens=MAX_NEW_TOKENS,
)

# Format the prompt with the model's chat template.
tokenizer = llm.get_tokenizer()
messages = [
    {"role": "user", "content": prompt_text}
]
prompt_formatted = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = llm.generate(
    prompt_formatted,
    sampling_params
)

llm_response = outputs[0].outputs[0].text.strip()

print("\n" + "=" * 70)
print(f"Prompt: \n{prompt_text}")
print("-" * 70)
print(f"(LLM Response): \n{llm_response}")
print("=" * 70)
```
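
For server-style deployment, the same model can be exposed through vLLM's OpenAI-compatible API. The sketch below assumes the server was started with `vllm serve InfiX-ai/InfiR2-1.5B-Instruct-FP8` on the default port 8000; adjust the base URL for your setup:

```python
from openai import OpenAI

# Point the client at the local vLLM server (default: http://localhost:8000/v1).
# vLLM does not check the API key unless one is configured, so a placeholder works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="InfiX-ai/InfiR2-1.5B-Instruct-FP8",
    messages=[{"role": "user", "content": "Briefly explain what a black hole is."}],
    temperature=0.8,
    max_tokens=256,
)
print(response.choices[0].message.content)
```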

## 📥 Model Download

```bash
# Create a directory for models
mkdir -p ./models
# Download the InfiR2-1.5B-Instruct-FP8 model
huggingface-cli download --resume-download InfiX-ai/InfiR2-1.5B-Instruct-FP8 --local-dir ./models/InfiR2-1.5B-Instruct-FP8
```
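
If you prefer to download from Python, `huggingface_hub` provides an equivalent (the target directory below mirrors the CLI command above):

```python
from huggingface_hub import snapshot_download

# Download the full model repository; resumes automatically if interrupted.
snapshot_download(
    repo_id="InfiX-ai/InfiR2-1.5B-Instruct-FP8",
    local_dir="./models/InfiR2-1.5B-Instruct-FP8",
)
```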

## 🎯 Intended Uses

### ✅ Direct Use

This model is intended for research and commercial use. Example use cases include:

- Instruction following
- Mathematical reasoning
- Code generation
- General reasoning

### ❌ Out-of-Scope Use

The model should **not** be used for:

- Generating harmful, offensive, or inappropriate content
- Creating misleading information

## 🙏 Acknowledgements

* We would like to express our gratitude to the following open-source projects: [Slime](https://github.com/THUDM/slime), [Megatron](https://github.com/NVIDIA/Megatron-LM), [TransformerEngine](https://github.com/NVIDIA/TransformerEngine), and [Qwen2.5](https://github.com/QwenLM/Qwen2.5-Math).

## 📝 Citation

If you find our work useful, please cite:

```bibtex
@misc{wang2025infir2comprehensivefp8training,
      title={InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models},
      author={Wenjun Wang and Shuo Cai and Congkai Xie and Mingfa Feng and Yiming Zhang and Zhen Li and Kejing Yang and Ming Li and Jiannong Cao and Hongxia Yang},
      year={2025},
      eprint={2509.22536},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.22536},
}
```