---
datasets:
- nvidia/OpenCodeReasoning
pipeline_tag: image-text-to-text
---
# Hicoder-R1-Distill-Gemma-27B
Notably, this CoT-enabled model was trained using only a single RTX 4090D, achieved through optimizations in both GPU VRAM and system RAM management, as well as specific techniques applied during the training steps.
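For context on why memory optimizations matter here, a back-of-the-envelope estimate of weight memory alone is sketched below. The figures are illustrative assumptions, not the author's training recipe: the parameter count is taken as a nominal 27 billion, the RTX 4090D is taken to have 24 GB of VRAM, and activations, gradients, and optimizer state are ignored.

```python
# Nominal parameter count (assumption: "27B" read as 27e9).
NUM_PARAMS = 27e9
# VRAM of a single RTX 4090D in GB (assumption for this estimate).
GPU_VRAM_GB = 24

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, nbytes in BYTES_PER_PARAM.items():
    size = weight_memory_gb(NUM_PARAMS, nbytes)
    fits = "fits" if size <= GPU_VRAM_GB else "does not fit"
    print(f"{name:>8}: {size:6.1f} GB -> {fits} in {GPU_VRAM_GB} GB")
```

Under these assumptions only the 4-bit row fits in VRAM, which is consistent with needing quantization or offloading to system RAM, although this card does not state which techniques were actually used.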
### Model Overview
**Hicoder-R1-Distill-Gemma-27B** is a large language model fine-tuned from Google's **Gemma-3 27B** base model. This model is specifically optimized for **Chain-of-Thought (CoT) reasoning** and **code generation** tasks.
* **Base Model:** google/gemma-3-27b
* **Fine-tuned by:** tonyli8623
* **Focus Areas:** Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging
* **Language:** Primarily English for prompts and reasoning, generates code in multiple languages.
### Key Features
* **Enhanced CoT Reasoning:** Explicitly trained to break down complex problems into intermediate steps before providing a final answer, particularly useful for complex coding or algorithmic tasks.
* **Strong Coding Capabilities:** Generates, explains, debugs, and translates code across various programming languages (e.g., Python, JavaScript, Java, C++, SQL, etc.).
* **Gemma-3 Foundation:** Built upon the powerful and efficient architecture of Google's Gemma-3 27B model.
* **Distillation Enhanced (Implied):** Potentially benefits from knowledge distillation for improved performance relative to standard fine-tuning on the target tasks.
### How to Use
You can use this model with the Hugging Face `transformers` library.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
# Specify the path to your fine-tuned model (local or Hugging Face Hub ID)
model_id = "tonyli8623/Hicoder-R1-Distill-Gemma-27B"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Use bfloat16 for efficiency if supported
    device_map="auto",  # Automatically distribute across available GPUs
)
# --- Example 1: Simple Code Generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
# Note: apply_chat_template uses the chat template shipped with the tokenizer
# (inherited from the Gemma-3 base model); adjust if your variant needs a different one.
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)
# --- Example 2: Code Generation with CoT ---
prompt_cot = """Think step-by-step to write a Python function that finds all prime numbers up to a given integer 'n' using the Sieve of Eratosthenes algorithm. Then, provide the function.
Let's break this down:
1. Understand the Sieve of Eratosthenes.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""
messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,  # Allow more tokens for CoT + code
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95,
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```
**Prompting:** For best results, especially when seeking CoT reasoning, explicitly ask the model to "think step-by-step" or to "provide your reasoning process before the code". You can also add a system prompt such as: "You are a code engineer proficient in various programming languages. Before answering, please carefully consider the question and create a logically coherent thought process, starting with `<think>` and ending with `</think>`. After thinking, provide the answer."
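The system prompt can be wired into the chat `messages` list, and the `<think>...</think>` block can be separated from the final answer after decoding. The following is a minimal sketch, not part of the original model card: `build_messages` and `split_reasoning` are hypothetical helper names, and it assumes that the model emits at most one well-formed `<think>` block as plain text and that the tokenizer's chat template accepts a `system` role.

```python
import re

SYSTEM_PROMPT = (
    "You are a code engineer proficient in various programming languages. "
    "Before answering, please carefully consider the question and create a "
    "logically coherent thought process, starting with <think> and ending "
    "with </think>. After thinking, provide the answer."
)

# Matches the first <think>...</think> block, including newlines inside it.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def build_messages(user_prompt, system_prompt=SYSTEM_PROMPT):
    """Return a chat message list suitable for tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def split_reasoning(response):
    """Split decoded text into (reasoning, answer).

    If no complete <think> block is found, reasoning is an empty string
    and the whole response is returned as the answer.
    """
    match = _THINK_RE.search(response)
    if match is None:
        return "", response.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (response[:match.start()] + response[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    demo = "<think>Factorial is n * (n-1)!.</think>\ndef fact(n): ..."
    reasoning, answer = split_reasoning(demo)
    print(reasoning)
    print(answer)
```

In the examples above, `build_messages(prompt_cot)` could replace the hand-written `messages_cot` list, and `split_reasoning(response_cot)` could be applied to the decoded output.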
### Limitations and Bias
* This model is based on Gemma-3, and inherits its capabilities and limitations.
* While fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. **Always review and test generated code thoroughly.**
* The model's knowledge is limited to its training data cutoff.
* Like all LLMs, it may exhibit biases present in the underlying training data.
* Chain-of-Thought reasoning may not always be perfect or logical.
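The advice to review and test generated code can be partly automated. The sketch below is an illustration only: `check_generated_code` is a hypothetical helper, and it runs the code in-process with `exec`, which is not a sandbox, so it should only be used on code you have already read.

```python
import ast


def check_generated_code(source, func_name, cases):
    """Run basic checks on a generated function.

    source:    Python source text produced by the model.
    func_name: name of the function the source is expected to define.
    cases:     list of (args_tuple, expected_result) pairs.
    Returns a list of human-readable problems; empty means all checks passed.
    """
    try:
        ast.parse(source)  # Syntax check without executing anything.
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    namespace = {}
    try:
        # WARNING: exec runs arbitrary code; this is NOT a security boundary.
        exec(source, namespace)
    except Exception as exc:
        return [f"module-level code raised {type(exc).__name__}"]

    func = namespace.get(func_name)
    if not callable(func):
        return [f"function {func_name!r} not defined"]

    problems = []
    for args, expected in cases:
        try:
            result = func(*args)
        except Exception as exc:  # Report runtime failures instead of crashing.
            problems.append(f"{func_name}{args} raised {type(exc).__name__}")
            continue
        if result != expected:
            problems.append(
                f"{func_name}{args} returned {result!r}, expected {expected!r}"
            )
    return problems


if __name__ == "__main__":
    generated = "def factorial(n):\n    return 1 if n <= 1 else n * factorial(n - 1)\n"
    print(check_generated_code(generated, "factorial", [((0,), 1), ((5,), 120)]))
```

Passing these checks only shows that the code agrees with the supplied cases; it does not replace a manual review for efficiency or security issues.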
### License
The license for this model depends on the base Gemma-3 model's license and any additional terms stated by the fine-tuner. The Gemma-3 models are typically governed by the "Gemma Terms of Use". Please consult the specific license file included with the model or the Gemma Terms of Use.
* **Gemma Terms of Use:** https://ai.google.dev/gemma/terms
* **Fine-tuning Specific License (if any):** [Specify if you add Apache 2.0, MIT, etc., or state it follows the base model license]
### Citation
If you use this model in your research or work, please consider citing:
```bibtex
@misc{hicoder_r1_distill_gemma_27b_[year],
title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
author={[Your Name/Organization]},
year={[Year of Release]},
howpublished={\url{[Link to Model Hub or Repository]}}
}
@misc{gemma3_2025,
title={Gemma 3 Technical Report},
author={Gemma Team, Google},
year={2025},
howpublished={\url{https://ai.google.dev/gemma}}
}
```
### Contact
For questions, feedback, or issues, please contact tonyli288@gmail.com.
---
---
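## Chinese Version
### Model Overview
**Hicoder-R1-Distill-Gemma-27B** is a large language model fine-tuned from Google's **Gemma-3 27B** base model. This model is specifically optimized for **Chain-of-Thought (CoT) reasoning** and **code generation** tasks.
* **Base Model:** google/gemma-3-27b
* **Fine-tuned by:** tonyli8623
* **Focus Areas:** Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging
* **Language:** Primarily English for prompts and reasoning; generates code in multiple programming languages.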
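### Key Features
* **Enhanced CoT Reasoning:** Explicitly trained to break down complex problems into intermediate steps before providing a final answer, which is particularly useful for complex coding or algorithmic tasks.
* **Strong Coding Capabilities:** Generates, explains, debugs, and translates code across various programming languages (e.g., Python, JavaScript, Java, C++, SQL).
* **Gemma-3 Foundation:** Built upon the powerful and efficient architecture of Google's Gemma-3 27B model.
* **Distillation Enhanced (Presumed):** May benefit from knowledge distillation, with improved performance relative to standard fine-tuning on the target tasks.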
### How to Use
You can use this model with the Hugging Face `transformers` library.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
# Specify the path to your fine-tuned model (local path or Hugging Face Hub ID)
model_id = "tonyli8623/Hicoder-R1-Distill-Gemma-27B"
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Use bfloat16 for efficiency if the hardware supports it
    device_map="auto",  # Automatically distribute the model across available GPUs
)
# --- Example 1: Simple Code Generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
# Note: apply_chat_template uses the chat template shipped with the tokenizer
# (inherited from the Gemma-3 base model); adjust if your variant needs a different one.
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)
# --- Example 2: Code Generation with CoT ---
prompt_cot = """Think step-by-step about how to write a Python function that uses the Sieve of Eratosthenes to find all prime numbers less than or equal to a given integer 'n'. Then, provide the function.
Let's break this down:
1. Understand how the Sieve of Eratosthenes works.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""
messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,  # Allow more tokens for CoT + code
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95,
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```
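**Prompting:** For best results, especially when CoT reasoning is needed, explicitly ask the model to "think step-by-step" or to "provide your reasoning process before the code". You can also add a system prompt such as: "You are a code engineer proficient in various programming languages. Before answering, please think carefully about the question and create a logically coherent thought process, starting with `<think>` and ending with `</think>`. After thinking, give the answer."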
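### Limitations and Bias
* This model is based on Gemma-3 and inherits its capabilities and limitations.
* Although fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. **Always review and test generated code carefully.**
* The model's knowledge is limited to its training data cutoff.
* Like all large language models, it may exhibit biases present in the underlying training data.
* Chain-of-Thought reasoning may not always be perfect or logical.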
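### License
The license for this model depends on the base Gemma-3 model's license and any additional terms stated by the fine-tuner. Gemma-3 models are typically governed by the "Gemma Terms of Use". Please consult the specific license file included with the model or the Gemma Terms of Use.
* **Gemma Terms of Use:** https://ai.google.dev/gemma/terms
* **Fine-tuning Specific License (if any):** [State here whether you add a license such as Apache 2.0 or MIT, or declare that it follows the base model's license]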
### Citation
If you use this model in your research or work, please consider citing:
```bibtex
@misc{hicoder_r1_distill_gemma_27b_[year],
title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
author={[Your Name/Organization]},
year={[Year of Release]},
howpublished={\url{[Link to Model Hub or Repository]}}
}
@misc{gemma3_2025,
title={Gemma 3 Technical Report},
author={Gemma Team, Google},
year={2025}
}
```