tonyli8623 committed · Commit 76780a9 · verified · 1 Parent(s): f19366e

Update README.md

Files changed (1): README.md (+252, -10)
---
datasets:
- nvidia/OpenCodeReasoning
language:
- en
- zh
metrics:
- accuracy
base_model:
- google/gemma-3-27b-it
---
# Hicoder-R1-Distill-Gemma-27B

### Model Overview

**Hicoder-R1-Distill-Gemma-27B** is a large language model fine-tuned from Google's **Gemma 3 27B** base model. It is specifically optimized for **Chain-of-Thought (CoT) reasoning** and **code generation** tasks. The "Distill" in the name indicates that knowledge distillation techniques may have been employed during fine-tuning, potentially leveraging outputs from a more powerful teacher model to enhance its reasoning and coding abilities.
* **Base Model:** google/gemma-3-27b-it
* **Fine-tuned by:** [Your Name/Organization]
* **Focus Areas:** Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging
* **Language:** Primarily English for prompts and reasoning; generates code in multiple languages.
### Key Features

* **Enhanced CoT Reasoning:** Explicitly trained to break complex problems into intermediate steps before giving a final answer, which is particularly useful for complex coding or algorithmic tasks.
* **Strong Coding Capabilities:** Generates, explains, debugs, and translates code across programming languages such as Python, JavaScript, Java, C++, and SQL.
* **Gemma 3 Foundation:** Built on the powerful and efficient architecture of Google's Gemma 3 27B model.
* **Distillation Enhanced (Implied):** Potentially benefits from knowledge distillation for improved performance relative to standard fine-tuning on the target tasks.
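For readers unfamiliar with the technique, the classic soft-target distillation objective can be sketched as follows. This is a generic NumPy illustration of Hinton-style knowledge distillation, not the actual recipe used to train this model; the `temperature` value is an arbitrary example.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the standard soft-target formulation."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    log_p_teacher = np.log(p_teacher + 1e-12)
    kl = (p_teacher * (log_p_teacher - log_p_student)).sum(axis=-1)
    return (temperature ** 2) * kl.mean()

same = np.array([[2.0, 1.0, 0.1]])
diff = np.array([[0.1, 1.0, 2.0]])
print(distillation_loss(same, same))  # 0.0 -- identical distributions
print(distillation_loss(diff, same))  # positive -- distributions diverge
```

Minimizing this loss pulls the student's softened output distribution toward the teacher's, which is how a stronger teacher model's reasoning behavior can be transferred during fine-tuning.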
### How to Use

You can use this model with the Hugging Face `transformers` library.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Specify the path to your fine-tuned model (local or Hugging Face Hub ID)
model_id = "[Your Hugging Face Model ID or Local Path, e.g., YourUsername/Hicoder-R1-Distill-Gemma-27B]"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Use bfloat16 for efficiency if supported
    device_map="auto"            # Automatically distribute across available GPUs
)

# --- Example 1: Simple Code Generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
# Note: Use the chat template associated with the base model (e.g., Gemma 3 instruct)
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(
    messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)

# --- Example 2: Code Generation with CoT ---
prompt_cot = """Think step-by-step to write a Python function that finds all prime numbers up to a given integer 'n' using the Sieve of Eratosthenes algorithm. Then, provide the function.

Let's break this down:
1. Understand the Sieve of Eratosthenes.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""

messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(
    messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,  # Allow more tokens for CoT + code
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```
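For reference, a typical completion for the first prompt would contain a small function along these lines. This is only an illustration of the expected shape of the answer; actual model output varies with the sampling settings above.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    result = 1
    for i in range(2, n + 1):  # multiply 2 * 3 * ... * n
        result *= i
    return result

print(factorial(5))  # 120
```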
**Prompting:** For best results, especially when you want CoT reasoning, explicitly ask the model to "think step-by-step" or to "provide your reasoning process before the code". Use the chat template associated with the base Gemma 3 model unless you defined a custom one during fine-tuning.
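A correct response to the CoT prompt above would end with a function roughly like the following (one possible implementation, shown here so you can check the model's output against a known-good reference):

```python
def sieve_of_eratosthenes(n):
    """Return all prime numbers <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Mark every multiple of p, starting at p*p, as composite
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [i for i, prime in enumerate(is_prime) if prime]

print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```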
### Limitations and Bias

* This model is based on Gemma 3 and inherits its capabilities and limitations.
* While fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. **Always review and test generated code thoroughly.**
* The model's knowledge is limited to its training data cutoff.
* Like all LLMs, it may exhibit biases present in the underlying training data.
* Chain-of-Thought reasoning may not always be perfect or logical.
### License

The license for this model depends on the base Gemma 3 model's license and any additional terms you impose. Gemma models are governed by the "Gemma Terms of Use". Please consult the specific license file included with the model or the Gemma Terms of Use.

* **Gemma Terms of Use:** https://ai.google.dev/gemma/terms
* **Fine-tuning Specific License (if any):** [Specify if you add Apache 2.0, MIT, etc., or state that it follows the base model license]
### Citation

If you use this model in your research or work, please consider citing:

```bibtex
@misc{hicoder_r1_distill_gemma_27b_[year],
  title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
  author={[Your Name/Organization]},
  year={[Year of Release]},
  howpublished={\url{[Link to Model Hub or Repository]}}
}

@misc{gemma3_2025,
  title={Gemma 3 Technical Report},
  author={Gemma Team, Google DeepMind},
  year={2025},
  howpublished={\url{https://ai.google.dev/gemma}}
}
```
### Contact

For questions, feedback, or issues, please contact tonyli288@gmail.com.

---
---
## 中文版 (Chinese Version)

### 模型概述

**Hicoder-R1-Distill-Gemma-27B** 是一个基于 Google **Gemma 3 27B** 基础模型进行微调的大型语言模型。该模型专门针对**思维链 (Chain-of-Thought, CoT) 推理**和**代码生成**任务进行了优化。名称中的 "Distill" 暗示在微调过程中可能采用了知识蒸馏技术,或许利用了更强大的教师模型的输出来增强其推理和编码能力。
* **基础模型:** google/gemma-3-27b-it
* **微调者:** [您的姓名/组织名称]
* **专注领域:** 思维链 (CoT), 代码生成, 代码解释, 代码调试
* **语言:** 主要使用英文进行提示和推理,可生成多种编程语言的代码。
### 主要特性

* **增强的 CoT 推理能力:** 经过专门训练,能够在给出最终答案之前将复杂问题分解为中间步骤,这对复杂的编码或算法任务特别有用。
* **强大的编码能力:** 能生成、解释、调试和翻译多种编程语言(如 Python, JavaScript, Java, C++, SQL 等)的代码。
* **基于 Gemma 3:** 构建于 Google 强大且高效的 Gemma 3 27B 模型架构之上。
* **蒸馏增强 (推测):** 可能受益于知识蒸馏,相对于在目标任务上的标准微调,性能有所提升。
### 如何使用

您可以通过 Hugging Face `transformers` 库来使用此模型。

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# 指定您的微调模型的路径 (本地路径或 Hugging Face Hub ID)
model_id = "[您的 Hugging Face 模型 ID 或本地路径, 例如: YourUsername/Hicoder-R1-Distill-Gemma-27B]"

# 加载分词器和模型
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 如果硬件支持,使用 bfloat16 以提高效率
    device_map="auto"            # 自动将模型分配到可用的 GPU 上
)

# --- 示例 1: 简单代码生成 ---
prompt_simple = "编写一个 Python 函数来计算一个数的阶乘。"
# 注意: 请使用与基础模型对应的聊天模板 (例如 Gemma 3 instruct)
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(
    messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- 简单代码生成 ---")
print(response_simple)

# --- 示例 2: 带 CoT 的代码生成 ---
prompt_cot = """请逐步思考如何编写一个 Python 函数,使用埃拉托斯特尼筛法 (Sieve of Eratosthenes) 找出小于等于给定整数 'n' 的所有素数。然后,提供该函数。

让我们分解一下步骤:
1. 理解埃拉托斯特尼筛法的原理。
2. 概述函数中需要的步骤。
3. 基于概述编写 Python 代码。"""

messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(
    messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,  # 为 CoT + 代码允许更多 token
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- 带 CoT 的代码生成 ---")
print(response_cot)
```
**提示词技巧 (Prompting):** 为了获得最佳效果,尤其是在需要 CoT 推理时,请明确要求模型“逐步思考”或“在代码前提供你的推理过程”。除非您在微调过程中定义了自定义模板,否则请使用与基础 Gemma 3 模型关联的聊天模板。
### 局限性与偏见

* 该模型基于 Gemma 3,继承了其能力和局限性。
* 尽管针对编码进行了微调,它仍可能生成不正确、低效或不安全的代码。**请务必仔细审查和测试生成的代码。**
* 模型的知识仅限于其训练数据的截止日期。
* 与所有大型语言模型一样,它可能表现出基础训练数据中存在的偏见。
* 思维链推理可能并非总是完美或符合逻辑。
### 许可证 (License)

该模型的许可证取决于基础 Gemma 3 模型的许可证以及您可能施加的任何附加条款。Gemma 模型受 "Gemma 使用条款" 的约束。请查阅模型附带的具体许可证文件或 Gemma 使用条款。

* **Gemma 使用条款:** https://ai.google.dev/gemma/terms
* **微调特定许可证 (如有):** [在此说明您是否添加了 Apache 2.0, MIT 等许可证,或声明其遵循基础模型的许可证]
### 引用

如果您在研究或工作中使用此模型,请考虑引用:

```bibtex
@misc{hicoder_r1_distill_gemma_27b_[年份],
  title={Hicoder-R1-Distill-Gemma-27B: 一个专注于思维链和代码生成的模型},
  author={[您的姓名/组织名称]},
  year={[发布年份]},
  howpublished={\url{[模型 Hub 或仓库的链接]}}
}

@misc{gemma3_2025,
  title={Gemma 3 Technical Report},
  author={Gemma Team, Google DeepMind},
  year={2025},
  howpublished={\url{https://ai.google.dev/gemma}}
}
```