Yuan-lab committed on
Commit f1f1e12 · verified · 1 Parent(s): 758db8a

Update README.md

Files changed (1)
  1. README.md +2 -17
README.md CHANGED
@@ -33,7 +33,7 @@ Yuan3.0 Ultra employs a unified multimodal model architecture, integrating a vis

  Yuan3.0 Ultra enhances the Reflection Inhibition Reward Mechanism (RIRM) proposed in <a href="https://github.com/Yuan-lab-LLM/Yuan3.0" target="_blank">**Yuan3.0 Flash**</a>. By incorporating reward constraints based on the number of reflection steps, the model actively reduces ineffective reflections after arriving at the "first correct answer," while retaining the necessary reasoning depth for complex problems. This approach effectively mitigates the "overthinking" phenomenon in fast-thinking reinforcement learning. Training results demonstrate that under this controlled fast-thinking strategy, the model’s accuracy improves significantly, while the number of tokens generated during reasoning continually decreases—achieving simultaneous gains in both accuracy and computational efficiency.

- Additionally, the <a href="./Docs/Yuan3.0_Ultra Paper.pdf">**technical report**</a> for Yuan3.0 Ultra has been released, which provides more detailed technical specifications and evaluation results.
+ Additionally, the <a href="https://github.com/Yuan-lab-LLM/Yuan3.0-Ultra/blob/main/Docs/Yuan3.0_Ultra%20Paper.pdf">**technical report**</a> for Yuan3.0 Ultra has been released, which provides more detailed technical specifications and evaluation results.

  <div align=center> <img src="https://huggingface.co/YuanLabAI/Yuan3.0-Ultra/resolve/main/docs/Yuan3.0-Ultra-architecture.png" width="80%" />

@@ -204,22 +204,7 @@ Spider 1.0 and BIRD are two major benchmarks in the Text-to-SQL domain. Yuan3.0
  | **Yuan3.0 Ultra** | **83.9** | 39.2 |


-
-
- ## 6. Quick Start
-
- ### 6.1 Yuan3.0 Ultra Inference
-
- Yuan3.0 Ultra supports both bfloat16 and int4 quantized models. For usage details, please refer to [QuickStart](vllm/README_Yuan.md).
-
-
- ### 6.2 Yuan3.0 Ultra Training
-
- We provide supervised fine-tuning scripts and reinforcement learning scripts for Yuan3.0 Ultra. Please refer to the fine-tuning training [documentation](rlhf/docs/instruct_tuning.md) and reinforcement learning [documentation](rlhf/docs/RL_training.md).
-
-
-
- ## 7. License
+ ## 6. License
  Use of Yuan 3.0 code and models must comply with the [Yuan 3.0 Model License Agreement](https://github.com/Yuan-lab-LLM/Yuan3.0?tab=License-1-ov-file). Yuan 3.0 models support commercial use and do not require an application for authorization. Please familiarize yourself with and adhere to the agreement. Do not use the open-source models, code, or any derivatives produced from this open-source project for any purposes that may cause harm to the nation or society, or for any services that have not undergone safety assessment and registration.

  Although measures have been taken during training to ensure data compliance and accuracy to the best of our ability, given the enormous scale of model parameters and the influence of probabilistic randomness, we cannot guarantee the accuracy of generated outputs, and models are susceptible to being misled by input instructions. This project assumes no responsibility for data security risks, public opinion risks, or any risks and liabilities arising from the model being misled, misused, disseminated, or improperly exploited due to the use of open-source models and code. You shall bear full and sole responsibility for all risks and consequences arising from your use, copying, distribution, and modification of this open-source project.
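
The technical-report link in this commit changes from a relative path containing a space to an absolute GitHub URL with the space percent-encoded. As a minimal sketch of that rewrite, the snippet below uses Python's standard `urllib.parse.quote`. The helper name `to_absolute_link` and the constant `REPO_BLOB_BASE` are illustrative assumptions and do not appear in the repository.

```python
from urllib.parse import quote

# Assumed base for files on the main branch; taken from the new href in this commit.
REPO_BLOB_BASE = "https://github.com/Yuan-lab-LLM/Yuan3.0-Ultra/blob/main/"


def to_absolute_link(relative_path: str, base: str = REPO_BLOB_BASE) -> str:
    """Turn a README-relative path into an absolute, percent-encoded URL."""
    # Drop a leading "./" so the path can be appended to the base directly.
    path = relative_path[2:] if relative_path.startswith("./") else relative_path
    # quote() leaves "/" untouched by default and encodes the space as %20.
    return base + quote(path)


print(to_absolute_link("./Docs/Yuan3.0_Ultra Paper.pdf"))
```

For the old href in this commit, the helper yields the same URL as the new href.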