nielsr (HF Staff) committed
Commit f9283b9 · verified · 1 Parent(s): 749da5d

Update paper link and model card title to Kwai Keye-VL 1.5 Technical Report


This PR updates the model card to reflect the [Kwai Keye-VL 1.5 Technical Report](https://huggingface.co/papers/2509.01563). This includes updating the main title, the technical report link in the header, and the citation information to align with the specified paper.

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -8,13 +8,13 @@ tags:
  - multimodal
  ---

- # Kwai Keye-VL
+ # Kwai Keye-VL 1.5

  <div align="center">
  <img src="asset/keye_logo_2.png" width="100%" alt="Kwai Keye-VL Logo">
  </div>

- <font size=3><div align='center' > [[🍎 Home Page](https://kwai-keye.github.io/)] [[πŸ“– Technical Report](https://huggingface.co/papers/2507.01949)] [[πŸ“Š Models](https://huggingface.co/Kwai-Keye)] [[πŸš€ Demo](https://huggingface.co/spaces/Kwai-Keye/Keye-VL-8B-Preview)] [[πŸ’» Code](https://github.com/Kwai-Keye/Keye)] </div></font>
+ <font size=3><div align='center' > [[🍎 Home Page](https://kwai-keye.github.io/)] [[πŸ“– Technical Report](https://huggingface.co/papers/2509.01563)] [[πŸ“Š Models](https://huggingface.co/Kwai-Keye)] [[πŸš€ Demo](https://huggingface.co/spaces/Kwai-Keye/Keye-VL-8B-Preview)] [[πŸ’» Code](https://github.com/Kwai-Keye/Keye)] </div></font>

  ## Abstract

@@ -461,7 +461,7 @@ The post-training phase of Kwai Keye is meticulously designed into two phases wi
  - Training Strategy: Uses a mix-mode GRPO algorithm for reinforcement learning, where reward signals evaluate both the correctness of results and the consistency of the process and results, ensuring synchronized optimization of reasoning processes and final outcomes.
  - **Step II.2: Iterative Alignment**
  - Objective: Address common issues like repetitive crashes and poor logic in model-generated content, and enable spontaneous reasoning mode selection to enhance final performance and stability.
- - Data Composition: Constructs preference data through Rejection Fine-Tuning (RFT), combining rule-based scoring (judging repetition, instruction following, etc.) and model scoring (cognitive scores provided by large models) to rank various model responses, building a high-quality preference dataset.
+ - Data Composition: Constructs preference data through Rejection Fine-Tuning (RFT), combining rule-based scoring (judging repetition, instruction following, etc.) and model scoring (cognitive scores provided by large models) and ranking various model responses, building a high-quality preference dataset.
  - Training Strategy: Multi-round iterative optimization with the constructed "good/bad" preference data pairs through the MPO algorithm. This aims to correct model generation flaws and ultimately enable it to intelligently and adaptively choose whether to activate deep reasoning modes based on problem complexity.

  </details>
@@ -479,14 +479,14 @@ The post-training phase of Kwai Keye is meticulously designed into two phases wi
  If you find our work helpful for your research, please consider citing our work.

  ```bibtex
- @misc{kwaikeyeteam2025kwaikeyevltechnicalreport,
- title={Kwai Keye-VL Technical Report},
+ @misc{kwaikeyeteam2025kwaikeyevl15technicalreport,
+ title={Kwai Keye-VL 1.5 Technical Report},
  author={Kwai Keye Team},
  year={2025},
- eprint={2507.01949},
+ eprint={2509.01563},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
- url={https://arxiv.org/abs/2507.01949},
+ url={https://arxiv.org/abs/2509.01563},
  }
  ```
492