# Hicoder-R1-Distill-Gemma-27B

### Model Overview

**Hicoder-R1-Distill-Gemma-27B** is a large language model fine-tuned from Google's **Gemma-2 27B** base model (*note: Gemma-2 27B is assumed, as Gemma-3 has not been publicly released*). This model is specifically optimized for **Chain-of-Thought (CoT) reasoning** and **code generation** tasks. The "Distill" in the name suggests that knowledge distillation may have been employed during fine-tuning, potentially leveraging outputs from a more powerful teacher model to enhance its reasoning and coding abilities.
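The card does not document the distillation recipe. As background only, knowledge distillation in the classic sense trains the student to match the teacher's softened output distribution; a minimal, dependency-free sketch of that loss (all names here are illustrative, not part of this model's training code):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's, scaled by T^2 as in the original distillation formulation."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    ce = -sum(p * math.log(q) for p, q in zip(p_teacher, p_student))
    return temperature ** 2 * ce

# A student whose logits match the teacher's incurs the minimum loss
# (the entropy of the teacher distribution); a mismatched student scores higher.
aligned = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
mismatched = distillation_loss([0.1, 1.0, 2.0], [2.0, 1.0, 0.1])
```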

* **Base Model:** google/gemma-2-27b (or the exact variant used, e.g., gemma-2-27b-it)
* **Fine-tuned by:** [Your Name/Organization]
* **Focus Areas:** Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging
* **Language:** Primarily English for prompts and reasoning; generates code in multiple languages.

### Key Features

* **Enhanced CoT Reasoning:** Explicitly trained to break complex problems into intermediate steps before giving a final answer, which is particularly useful for complex coding or algorithmic tasks.
* **Strong Coding Capabilities:** Generates, explains, debugs, and translates code across various programming languages (e.g., Python, JavaScript, Java, C++, SQL).
* **Gemma-2 Foundation:** Built on the powerful and efficient architecture of Google's Gemma-2 27B model.
* **Distillation Enhanced (implied):** Potentially benefits from knowledge distillation, improving performance on the target tasks relative to standard fine-tuning.

### How to Use

You can use this model with the Hugging Face `transformers` library.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Specify the path to your fine-tuned model (local or Hugging Face Hub ID)
model_id = "[Your Hugging Face Model ID or Local Path, e.g., YourUsername/Hicoder-R1-Distill-Gemma-27B]"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Use bfloat16 for efficiency if supported
    device_map="auto"            # Automatically distribute across available GPUs
)

# --- Example 1: Simple Code Generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
# Note: Use the chat template of the base model (e.g., Gemma-2 instruct), adjusting if needed.
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(
    messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)

# --- Example 2: Code Generation with CoT ---
prompt_cot = """Think step-by-step to write a Python function that finds all prime numbers up to a given integer 'n' using the Sieve of Eratosthenes algorithm. Then, provide the function.

Let's break this down:
1. Understand the Sieve of Eratosthenes.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""

messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(
    messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,  # Allow more tokens for CoT + code
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```
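For reference, a correct answer to the two example prompts above might resemble the following (hand-written reference implementations, not actual model output):

```python
def factorial(n: int) -> int:
    """Iteratively compute n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def sieve_of_eratosthenes(n: int) -> list[int]:
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    # Cross off multiples of each prime, starting from its square.
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [i for i, prime in enumerate(is_prime) if prime]

print(factorial(5))               # 120
print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```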

**Prompting:** For best results, especially when seeking CoT reasoning, explicitly ask the model to "think step-by-step" or "provide your reasoning process before the code". Use the chat template associated with the base Gemma-2 model unless you defined a custom one during fine-tuning.
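`apply_chat_template` handles the formatting for you, but when debugging prompts it helps to know roughly what the rendered text looks like. The published Gemma chat format wraps each turn in `<start_of_turn>`/`<end_of_turn>` markers; the sketch below reproduces that shape for illustration only, so verify against your tokenizer's actual `chat_template` (a custom fine-tune may differ):

```python
def render_gemma_prompt(messages):
    """Approximate the Gemma-style chat format: each turn is wrapped in
    <start_of_turn>/<end_of_turn> markers, and a trailing open 'model'
    turn cues generation (what add_generation_prompt=True adds)."""
    parts = []
    for msg in messages:
        parts.append(f"<start_of_turn>{msg['role']}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = render_gemma_prompt(
    [{"role": "user", "content": "Write a factorial function."}]
)
print(prompt)
```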

### Limitations and Bias

* This model is based on Gemma-2 and inherits its capabilities and limitations.
* While fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. **Always review and test generated code thoroughly.**
* The model's knowledge is limited to its training data cutoff.
* Like all LLMs, it may exhibit biases present in the underlying training data.
* Chain-of-Thought reasoning may not always be correct or logical.

### License

The license for this model depends on the base Gemma-2 model's license and any additional terms you impose. Gemma-2 models are typically governed by the "Gemma Terms of Use". Please consult the specific license file included with the model or the Gemma Terms of Use.

* **Gemma Terms of Use:** [Link to Google's Gemma Terms, e.g., https://ai.google.dev/gemma/terms]
* **Fine-tuning Specific License (if any):** [Specify if you add Apache 2.0, MIT, etc., or state that it follows the base model's license]

### Citation

If you use this model in your research or work, please consider citing:

```bibtex
@misc{hicoder_r1_distill_gemma_27b_[year],
  title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
  author={[Your Name/Organization]},
  year={[Year of Release]},
  howpublished={\url{[Link to Model Hub or Repository]}}
}

@misc{gemma2_2024,
  title={Gemma 2 Technical Report},
  author={Gemma Team, Google},
  year={2024},
  howpublished={\url{https://ai.google.dev/gemma}} % Replace with the Gemma 2 paper/report link if available
}
```

### Contact

For questions, feedback, or issues, please contact tonyli288@gmail.com.

---

## Chinese Version

### Model Overview

**Hicoder-R1-Distill-Gemma-27B** is a large language model fine-tuned from Google's **Gemma-2 27B** base model (*note: Gemma-2 27B is assumed, as Gemma-3 has not been publicly released*). This model is specifically optimized for **Chain-of-Thought (CoT) reasoning** and **code generation** tasks. The "Distill" in the name suggests that knowledge distillation may have been employed during fine-tuning, potentially leveraging outputs from a more powerful teacher model to enhance its reasoning and coding abilities.

* **Base Model:** google/gemma-2-27b (or the exact variant used, e.g., gemma-2-27b-it)
* **Fine-tuned by:** [Your Name/Organization]
* **Focus Areas:** Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging
* **Language:** Primarily English for prompts and reasoning; generates code in multiple programming languages.

### Key Features

* **Enhanced CoT Reasoning:** Specifically trained to break complex problems into intermediate steps before giving a final answer, which is particularly useful for complex coding or algorithmic tasks.
* **Strong Coding Capabilities:** Generates, explains, debugs, and translates code in many programming languages (e.g., Python, JavaScript, Java, C++, SQL).
* **Gemma-2 Foundation:** Built on the powerful and efficient architecture of Google's Gemma-2 27B model.
* **Distillation Enhanced (implied):** Potentially benefits from knowledge distillation, improving performance on the target tasks relative to standard fine-tuning.

### How to Use

You can use this model with the Hugging Face `transformers` library.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Specify the path to your fine-tuned model (local path or Hugging Face Hub ID)
model_id = "[Your Hugging Face Model ID or Local Path, e.g., YourUsername/Hicoder-R1-Distill-Gemma-27B]"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Use bfloat16 for efficiency if the hardware supports it
    device_map="auto"            # Automatically distribute across available GPUs
)

# --- Example 1: Simple Code Generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
# Note: Use the chat template of the base model (e.g., Gemma-2 instruct), adjusting if needed.
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(
    messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)

# --- Example 2: Code Generation with CoT ---
prompt_cot = """Think step-by-step to write a Python function that finds all prime numbers up to a given integer 'n' using the Sieve of Eratosthenes algorithm. Then, provide the function.

Let's break this down:
1. Understand the Sieve of Eratosthenes.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""

messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(
    messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,  # Allow more tokens for CoT + code
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```

**Prompting:** For best results, especially when CoT reasoning is desired, explicitly ask the model to "think step-by-step" or "provide your reasoning process before the code". Use the chat template associated with the base Gemma-2 model unless you defined a custom one during fine-tuning.

### Limitations and Bias

* This model is based on Gemma-2 and inherits its capabilities and limitations.
* Although fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. **Always review and test generated code carefully.**
* The model's knowledge is limited to its training data cutoff.
* Like all large language models, it may exhibit biases present in the underlying training data.
* Chain-of-Thought reasoning may not always be correct or logical.

### License

The license for this model depends on the base Gemma-2 model's license and any additional terms you impose. Gemma-2 models are typically governed by the "Gemma Terms of Use". Please consult the specific license file included with the model or the Gemma Terms of Use.

* **Gemma Terms of Use:** [Link to Google's Gemma Terms, e.g., https://ai.google.dev/gemma/terms]
* **Fine-tuning Specific License (if any):** [Specify whether you add Apache 2.0, MIT, etc., or state that it follows the base model's license]

### Citation

If you use this model in your research or work, please consider citing:

```bibtex
@misc{hicoder_r1_distill_gemma_27b_[year],
  title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
  author={[Your Name/Organization]},
  year={[Year of Release]},
  howpublished={\url{[Link to Model Hub or Repository]}}
}

@misc{gemma2_2024,
  title={Gemma 2 Technical Report},
  author={Gemma Team, Google},
  year={2024},
  howpublished={\url{https://ai.google.dev/gemma}}
}
```