---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- code-review
- python
- qwen2
- fine-tuned
datasets:
- custom
language:
- en
pipeline_tag: text-generation
---

# Code Review Critic

A fine-tuned Qwen2.5-Coder-7B-Instruct model for Python code review.

## Model Description

This model provides constructive, actionable feedback on Python code. It focuses on:
- Bug detection
- Potential issues
- Code quality improvements

- **Base Model:** Qwen/Qwen2.5-Coder-7B-Instruct
- **Fine-tuning Method:** QLoRA (4-bit quantization + LoRA adapters)
- **Training Data:** 8,275 real GitHub PR review comments from major Python projects

## Training Details

- **LoRA Rank:** 64
- **LoRA Alpha:** 64
- **Learning Rate:** 2e-4
- **Epochs:** 2
- **Final Eval Loss:** 0.8455

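A QLoRA setup matching the hyperparameters above might look like the following sketch using `transformers` and `peft`. The target modules, dropout, and compute dtype are assumptions for illustration; the card does not state them.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization for the frozen base model (the "Q" in QLoRA).
# Compute dtype and double quantization are assumed, not stated in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# LoRA adapters with the reported rank/alpha; target_modules and dropout
# are typical choices for Qwen2-style attention layers, assumed here.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```

These configs would be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and `peft.get_peft_model`, with the stated learning rate of 2e-4 set on the trainer.
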
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "YOUR_USERNAME/code-review-critic",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/code-review-critic")

messages = [
    {"role": "system", "content": "You are an expert code reviewer. Analyze the provided Python code and give constructive, specific feedback."},
    {"role": "user", "content": "Review this Python code:\n\n```python\ndef get_user(id):\n return db.query(f'SELECT * FROM users WHERE id = {id}')\n```"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
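
The decode above returns the full transcript, prompt included. To print only the model's reply, slice `outputs[0]` past the prompt length, e.g. `outputs[0][inputs["input_ids"].shape[1]:]`. The indexing idea, shown with plain lists standing in for the token tensors:

```python
# Stand-ins for inputs["input_ids"][0] (the prompt) and outputs[0]
# (prompt + generated continuation); real code uses torch tensors.
prompt_ids = [15191, 821, 1042]
output_ids = [15191, 821, 1042, 5501, 2]

# Keep only the tokens generated after the prompt, then decode those.
new_ids = output_ids[len(prompt_ids):]
print(new_ids)  # [5501, 2]
```
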