---
license: mit
language:
- en
- ko
- code
library_name: transformers
tags:
- code-llama
- code-review
- fine-tuning
- SFT
- LoRA
pipeline_tag: text-generation
base_model:
- codellama/CodeLlama-7b-hf
---

# Model Card for codellama-7b-code-review

This model card describes the `codellama-7b-code-review` model.

## Model Details

This model is fine-tuned from Meta's `codellama/CodeLlama-7b-hf` to review and provide feedback on code changes (`diffs`) from GitHub Pull Requests. It was trained primarily on JavaScript and React code reviews and aims to generate constructive feedback from a senior engineer's perspective on topics such as code quality, architecture, performance, and conventions.

- **Developed by:** [ken123777](https://huggingface.co/ken123777)
- **Model type:** Causal Language Model
- **Language(s):** English, Korean, diff format
- **License:** apache-2.0
- **Finetuned from model:** `codellama/CodeLlama-7b-hf`

### Model Sources

- **Repository:** [https://huggingface.co/ken12377/codellama-7b-code-review](https://huggingface.co/ken12377/codellama-7b-code-review)

## Uses

### Direct Use

This model can be used directly for code-review automation. Given code changes in `diff` format as input, the model generates review comments on them.

**Warning:** Content generated by the model always requires review; the final decision must be made by a human developer.

### Downstream Use

This model can serve as a base for further fine-tuning on a specific project's internal coding conventions or on more specialized review criteria.

### Out-of-Scope Use

This model is specialized for code-review tasks. It may not perform well for other purposes such as general-purpose chat, code generation, or translation. In particular, inputting code that is not in `diff` format may lead to unexpected results.
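As a minimal sketch of preparing a direct-use input, the diff can be wrapped in the instruction/input/response template this card uses for inference. The helper name `build_review_prompt` and the English wording of the instruction are illustrative, not part of the released model:

```python
def build_review_prompt(diff_code: str) -> str:
    """Wrap a unified diff in the instruction/input/response template the
    model expects (helper name and English wording are illustrative)."""
    return (
        "### Instruction:\n"
        "The code below is the diff of a pull request. Provide detailed, "
        "specific feedback on at least three areas for improvement.\n\n"
        "### Input:\n"
        "```diff\n"
        f"{diff_code}\n"
        "```\n\n"
        "### Response:\n\n1. "
    )

# Example: a toy one-line diff
prompt = build_review_prompt("-const x = 1;\n+const x = 2;")
print(prompt.splitlines()[0])  # → ### Instruction:
```

Ending the prompt with `1. ` nudges the model to answer as a numbered list of review comments.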

## Bias, Risks, and Limitations

- **Data bias:** The model was trained on public GitHub Pull Request data, so it may be biased towards the specific coding styles or patterns present in that data.
- **Inaccuracy (hallucination):** The model may occasionally generate feedback that is factually incorrect or out of context. Generated reviews always need verification.
- **Limited knowledge:** The model's knowledge is limited to the data available at the time of fine-tuning and may not reflect the latest library or framework updates.

### Recommendations

Treat the code reviews generated by the model as a draft or assistive tool that supports the development process, not as a final judgment. Critical changes should always be reviewed by a human expert.

## How to Get Started with the Model

**Note:** This model may be available in two versions: **Adapter** and **Merged**. Use the code path appropriate for your model type.

#### 1. Using the Adapter Model (`ken12377/codellama-7b-code-review-adapter`)

To use the adapter model, first load the base model and then apply the adapter with the `peft` library.

#### 2. Using the Merged Model (`ken12377/codellama-7b-code-review`)

If the model is fully merged with the base model, you can load it directly without `peft`.

````python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# --- Configuration (choose one) ---
# 1. For the adapter model
use_adapter = True
base_model_name = "codellama/CodeLlama-7b-hf"
adapter_or_model_name = "ken12377/codellama-7b-code-review-adapter"

# 2. For the merged model
# use_adapter = False
# adapter_or_model_name = "ken12377/codellama-7b-code-review"

# --- Load model and tokenizer ---
if use_adapter:
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(adapter_or_model_name)
    model = PeftModel.from_pretrained(base_model, adapter_or_model_name)
else:
    tokenizer = AutoTokenizer.from_pretrained(adapter_or_model_name)
    model = AutoModelForCausalLM.from_pretrained(
        adapter_or_model_name,
        torch_dtype=torch.float16,
        device_map="auto",
    )

model.eval()

# --- Inference ---
diff_code = """
--- a/src/components/LoginForm.js
+++ b/src/components/LoginForm.js
-import React from 'react';
+import React, { useState } from 'react';

-const LoginForm = () => (
-  <form>
-    <label>Email: <input type="email" /></label>
-    <br />
-    <label>Password: <input type="password" /></label>
-    <br />
-    <button type="submit">Log In</button>
-  </form>
-);
+const LoginForm = () => {
+  const [credentials, setCredentials] = useState({ email: '', password: '' });
+  /* ... (rest of the diff code) ... */
+};

export default LoginForm;
"""

# Prompt (the original card uses a Korean instruction; an English equivalent is shown)
prompt = f"""### Instruction:
The code below is the diff of a pull request. Provide detailed, specific feedback on at least three distinct areas where the code could be improved.

### Input:
```diff
{diff_code}
```

### Response:

1. """

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, repetition_penalty=1.2)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
````

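Because the prompt seeds the response with `1. `, the generated review typically comes back as a numbered list. A small post-processing helper (the name `split_feedback` is our own, not part of this repository) can split it into discrete comments:

```python
import re

def split_feedback(response: str) -> list[str]:
    """Split a numbered review ('2. ...', '3. ...') into separate comments.
    The prompt already seeds item 1, so the decoded text starts mid-item."""
    text = "1. " + response.strip()  # restore the seeded "1. " prefix
    # Split on item numbers at the start of a line
    items = re.split(r"(?m)^\s*\d+\.\s+", text)
    return [item.strip() for item in items if item.strip()]

comments = split_feedback(
    "Consider debouncing input handlers.\n2. Validate the email format.\n3. Avoid inline styles."
)
print(len(comments))  # → 3
```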
## Training Details

### Training Data

The model was fine-tuned on `review_dataset.json`, a file containing public Pull Request data collected from GitHub. The dataset is structured in an `instruction`, `input` (diff), `output` (review comment) format.

### Training Procedure

The model was fine-tuned with the QLoRA technique, using the `SFTTrainer` from the `trl` library and applying 4-bit quantization and LoRA (Low-Rank Adaptation) for efficient training.

#### Training Hyperparameters

- **model:** `codellama/CodeLlama-7b-hf`
- **max_seq_length:** 4096
- **lora_alpha:** 128
- **lora_dropout:** 0.1
- **lora_r:** 64
- **learning_rate:** 2e-4
- **optimizer:** paged_adamw_32bit
- **gradient_accumulation_steps:** 8
- **per_device_train_batch_size:** 2
- **max_steps:** 1900
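A hedged sketch of how such a QLoRA run could be wired up with `trl` and `peft` under the hyperparameters listed above. The quantization settings, the LoRA `target_modules`, the output directory, and the exact `trl` API (recent versions use `SFTConfig`) are assumptions on our part, not details confirmed by this card:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig

# 4-bit quantization for QLoRA (settings assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf", quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")

# LoRA configuration matching the card's hyperparameters; target_modules assumed
peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# instruction / input / output records, as described under Training Data
dataset = load_dataset("json", data_files="review_dataset.json", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="codellama-7b-code-review-adapter",
        max_seq_length=4096,
        learning_rate=2e-4,
        optim="paged_adamw_32bit",
        gradient_accumulation_steps=8,
        per_device_train_batch_size=2,
        max_steps=1900,
    ),
)
trainer.train()
```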

## Compute Infrastructure

- **Hardware type:** RunPod Cloud GPU
- **Cloud provider:** RunPod