Yuan-lab committed on
Commit b721cac · verified · 1 Parent(s): b2d4dae

Update README.md

Files changed (1):
  1. README.md +7 -40
README.md CHANGED
@@ -6,8 +6,8 @@
 
 <hr>
 <div align="center" style="line-height: 1;">
- <a href="https://huggingface.co/YuanLabAI"><img alt="Hugging Face"
- src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Yuan%20LLM-ffc107?color=ffc107&logoColor=white"/></a>
+ <a href="https://github.com/Yuan-lab-LLM/Yuan3.0"><img alt="GitHub"
+ src="https://img.shields.io/badge/GitHub-Yuan%203.0%20Repo-181717?logo=github&logoColor=white"/></a>
  <a href="https://www.modelscope.cn/profile/Yuanlab"><img alt="ModelScope"
  src="https://img.shields.io/badge/💾%20ModelScope-Yuan%20LLM-6b4fbb?color=6b4fbb&logoColor=white"/></a>
  <a href="https://x.com/Yuanlabai"><img alt="Twitter Follow"
@@ -21,15 +21,6 @@
  </div>
 
 
-
- <h4 align="center">
- <p>
- <a href="./README_ZH.md">简体中文</a> |
- <b>English</b>
- <p>
- </h4>
-
-
  -----
 
 
@@ -44,10 +35,9 @@
 
  Yuan 3.0 Flash, developed by the **YuanLab.ai team**, is a **40B-parameter multimodal foundation model** built on a Mixture of Experts (MoE) architecture that activates only about **3.7B parameters** per inference. Through a novel reinforcement learning training method (RAPO), it significantly reduces inference token consumption while improving reasoning accuracy, exploring a path toward "less computation, higher intelligence" for large language models. We have also released the <a href="https://github.com/Yuan-lab-LLM/Yuan3.0/blob/main/docs/YUAN3.0_FLASH-paper.pdf" target="_blank">**technical report**</a> for the Yuan3.0 model, where you can find more detailed technical information and evaluation results.
 
- <div align=center> <img src=https://github.com/Yuan-lab-LLM/Yuan3.0/blob/main/docs/Yuan3.0-architecture.png width=80% />
-
+ <div align="center">
+ <img src="https://huggingface.co/YuanLabAI/Yuan3.0-Flash-4bit/resolve/main/docs/Yuan3.0-architecture.png" width="80%" />
  Fig.1: Yuan3.0 Multimodal Large Language Model Architecture
-
  </div>
 
  ### Core Features
@@ -62,10 +52,10 @@ Fig.1: Yuan3.0 Multimodal Large Language Model Architecture
 
  Yuan 3.0 Flash outperforms GPT-5.1 on enterprise-grade RAG, multimodal retrieval, table understanding, summary generation, and other tasks. With 40B parameters, it matches the reasoning accuracy of 235B/671B models while reducing token consumption by 50%-75%, giving enterprises a high-performance, low-cost large language model solution.
 
- <div align=center> <img src=https://github.com/Yuan-lab-LLM/Yuan3.0/blob/main/docs/Yuan3.0-benchmarks.png width=80% />
-
- Fig.2: Yuan3.0 Flash Evaluation Results
 
+ <div align="center">
+ <img src="https://huggingface.co/YuanLabAI/Yuan3.0-Flash-4bit/resolve/main/docs/Yuan3.0-benchmarks.png" width="80%" />
+ Fig.2: Yuan3.0 Flash Evaluation Results
  </div>
 
 
@@ -188,26 +178,3 @@ Summarization generation is a core requirement for historical information compre
  | **Yuan3.0 Flash** | **59.31** | 51.32 | 28.32 | 89.99 | 45.34 |
 
 
- ---
-
- ## 6. Quick Start
-
- **6.1 Yuan3.0 Flash Inference**
-
- Yuan3.0 Flash supports bfloat16 and int4 quantized models. For usage details, please refer to the [QuickStart](vllm/README_Yuan.md).
-
-
- **6.2 Data Preprocessing**
-
- We provide data preprocessing scripts. Please refer to the data preprocessing [documentation](rlhf/docs/data_process.md).
-
- **6.3 Model Fine-tuning Training**
-
- We provide supervised fine-tuning scripts and reinforcement learning workflows for the Yuan3.0 Flash model. Please refer to the fine-tuning [documentation](rlhf/docs/instruct_tuning.md) and reinforcement learning [documentation](rlhf/docs/RL_training.md).
-
-
-
- ## 7. License Agreement
- The use of Yuan 3.0 code and models must comply with the [Yuan 3.0 Model License Agreement](https://github.com/Yuan-lab-LLM/Yuan3.0?tab=License-1-ov-file). The Yuan 3.0 model supports commercial use without requiring an authorization application. Please understand and comply with the agreement, and do not use the open-source model, code, or derivatives of this project for any purpose that may harm the country or society, or for any service that has not undergone security assessment and filing.
-
- Although we have taken measures to ensure the compliance and accuracy of the training data, the large number of model parameters and the probabilistic nature of generation mean that we cannot guarantee the accuracy of the output, and the model can be misled by input instructions. This project assumes no responsibility for data security or public-opinion risks arising from the open-source model and code, nor for any risks or liabilities arising from the model being misled, abused, disseminated, or improperly used. You bear all risks and consequences arising from your use, copying, distribution, and modification of the model under this open-source project.
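The model description above attributes the efficiency of Yuan 3.0 Flash to its MoE design: roughly 3.7B of the 40B parameters are active per inference because each token is routed to a small subset of experts. The following is a minimal sketch of top-k expert routing, assuming hypothetical sizes (16 experts, top-2, a 512-dim hidden state) that are illustrative only and not the actual Yuan 3.0 Flash configuration.

```python
# Minimal top-k MoE routing sketch (hypothetical sizes; not the Yuan 3.0 Flash config).
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                        # x: (n_tokens, d_model)
        probs = self.router(x).softmax(dim=-1)   # (n_tokens, n_experts)
        weights, picks = torch.topk(probs, self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Only the experts a token was routed to are executed, so most expert
        # parameters stay idle on any given forward pass.
        for e, expert in enumerate(self.experts):
            token_idx, slot = (picks == e).nonzero(as_tuple=True)
            if token_idx.numel():
                out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out

tokens = torch.randn(4, 512)
print(TinyMoELayer()(tokens).shape)  # torch.Size([4, 512]); only 2 of 16 experts run per token
```

In a full MoE transformer the router sits inside each expert feed-forward block and is trained with load-balancing terms; the point of the sketch is only that per-token compute scales with the k selected experts rather than with the total parameter count.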
 
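The removed Quick Start section states that Yuan3.0 Flash supports bfloat16 and int4 quantized checkpoints and defers the details to vllm/README_Yuan.md. As a rough sketch of what offline generation with vLLM usually looks like; the checkpoint paths, quantization backend, and sampling values below are placeholders, not the documented Yuan 3.0 procedure.

```python
# Hypothetical vLLM offline-inference sketch; model paths, quantization backend,
# and sampling values are assumptions, not the official Yuan 3.0 instructions.
from vllm import LLM, SamplingParams

sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512)

# bfloat16 variant (assumed local checkpoint path)
llm = LLM(model="/models/Yuan3.0-Flash", dtype="bfloat16", trust_remote_code=True)

# An int4 checkpoint would instead pass the quantization backend its weights were
# packed for, e.g. quantization="awq" or quantization="gptq", depending on the release.

outputs = llm.generate(
    ["Summarize the Yuan 3.0 Flash architecture in one sentence."], sampling
)
for request_output in outputs:
    print(request_output.outputs[0].text)
```

The linked vllm/README_Yuan.md remains the authoritative reference for the supported dtypes and launch options.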