Update README.md
README.md CHANGED
@@ -1,3 +1,16 @@
+---
+language: zh
+license: apache-2.0
+base_model: Qwen/Qwen2.5-7B-Instruct
+tags:
+- generated_from_trainer
+- lora
+- peft
+library_name: peft
+---
+
+# Chinese LLM MCQ Model with Neutrality Optimization - KAGGLE #3
+
 This is the model for KAGGLE #3 of the NYCU Deep Learning course. It was trained from Qwen2.5-7B-Instruct with GRPO (Group Relative Policy Optimization) reinforcement learning, with a focus on improving the neutrality and reasoning quality of the model's answers.
 
 ## Model Information
@@ -39,7 +52,7 @@ model = PeftModel.from_pretrained(
 )
 
 # Load the tokenizer
-tokenizer = AutoTokenizer.from_pretrained("
+tokenizer = AutoTokenizer.from_pretrained("RayTsai/Kaggle_3_GRPO_Neutrality")
 
 # Use the neutrality prompt
 prompt = """請從多元視角分析以下問題:
@@ -141,6 +154,7 @@ final_answer = extract_answer_from_reasoning(reasoning)
 
 * Ray Tsai (110651053)
 * NYCU Deep Learning Course 2025
+
 ## License
 
 This model follows the original license terms of Qwen2.5.
@@ -150,5 +164,4 @@ final_answer = extract_answer_from_reasoning(reasoning)
 * [KAGGLE #1 - SFT model](https://huggingface.co/RayTsai/chinese-llm-mcq-qwen2-5-14b)
 * [KAGGLE #2 - Reasoning-chain model](https://huggingface.co/RayTsai/Kaggle_2)
 * [Technical report](https://github.com/RayTsai/chinese-llm-neutrality)
-* [NYCU Deep Learning course](https://www.nycu.edu.tw)
-}
+* [NYCU Deep Learning course](https://www.nycu.edu.tw)
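
For context, the hunk at `@@ -39,7 +52,7 @@` completes the tokenizer load that was previously truncated. The following is a minimal sketch of the full loading path, not the README's actual code. It assumes the LoRA adapter is hosted in the same repository as the tokenizer (`RayTsai/Kaggle_3_GRPO_Neutrality`), that `transformers`, `peft`, and `accelerate` are installed, and that the question is placed on the line after the neutrality instruction. The helper names `build_neutral_prompt` and `load_model_and_tokenizer` are hypothetical.

```python
BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"  # base_model from the README front matter
ADAPTER_REPO = "RayTsai/Kaggle_3_GRPO_Neutrality"  # assumption: adapter lives in the tokenizer repo

# First line of the neutrality prompt shown in the diff
# ("Please analyze the following question from multiple perspectives:").
NEUTRAL_PREFIX = "請從多元視角分析以下問題:"


def build_neutral_prompt(question: str) -> str:
    """Prefix a question with the neutrality instruction (layout is an assumption)."""
    return f"{NEUTRAL_PREFIX}\n{question.strip()}"


def load_model_and_tokenizer(base_model: str = BASE_MODEL,
                             adapter_repo: str = ADAPTER_REPO):
    """Load the base model, attach the LoRA adapter, and load the tokenizer."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(
        base_model,
        torch_dtype="auto",
        device_map="auto",  # requires the accelerate package
    )
    model = PeftModel.from_pretrained(base, adapter_repo)
    tokenizer = AutoTokenizer.from_pretrained(adapter_repo)
    return model, tokenizer
```

Calling `load_model_and_tokenizer()` downloads the 7B base weights, so it is kept separate from the prompt helper, which can be used and checked on its own.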