SamirXR committed 697d1ca (verified) · Parent(s): e161753 — Create README.md
Files changed (1): README.md (+216)
---
license: mit
datasets:
- iamtarun/python_code_instructions_18k_alpaca
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
tags:
- code
- python
- text-generation
- coding
- yzy
- code-generation
---
# yzy-python-0.5b 🐍
A lightweight Python-focused language model (0.5B parameters) fine-tuned for code generation and instruction following.

Optimized for:
- Python code generation
- scripting help
- small coding copilots
- local inference
- experimentation
- hackathons

Base model: Qwen2.5-0.5B-Instruct
Fine-tuning method: QLoRA (4-bit)
Dataset style: Alpaca-format Python instructions
---

# Demo

## Transformers usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "SamirXR/yzy-python-0.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto"
)

# Use the same Alpaca-style prompt format the model was fine-tuned on
prompt = "### Instruction:\nWrite a Python function to reverse a string\n\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---

## 4-bit inference (recommended)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "SamirXR/yzy-python-0.5b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

prompt = "### Instruction:\nWrite a Python function for Fibonacci numbers\n\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,  # required for temperature / top_p to take effect
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---

## Gradio Chatbot Demo

```python
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "SamirXR/yzy-python-0.5b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

def generate_code(instruction, history):
    # Rebuild the Alpaca-style prompt used during fine-tuning
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.eos_token_id
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text generated after the response marker
    response = response.split("### Response:\n")[-1].strip()

    return response

demo = gr.ChatInterface(
    fn=generate_code,
    title="yzy-python-0.5b Chatbot",
    description="Python coding assistant (QLoRA fine-tuned Qwen2.5-0.5B)",
    examples=[
        "Write a function to calculate Fibonacci numbers",
        "Create a Python class for a linked list",
        "Reverse a string in Python"
    ],
)

demo.launch(share=True)
```
---

# Training Details

Base model:
Qwen/Qwen2.5-0.5B-Instruct

Dataset:
iamtarun/python_code_instructions_18k_alpaca

Format used during training:

```
### Instruction:
<task>

### Response:
<answer>
```
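A small pair of helpers (hypothetical names, not part of the model repo) can assemble this prompt and strip the completion back out, matching the format above:

```python
def build_prompt(task: str) -> str:
    # Assemble the Alpaca-style training prompt
    return f"### Instruction:\n{task}\n\n### Response:\n"

def extract_response(decoded: str) -> str:
    # Keep only the text generated after the response marker
    return decoded.split("### Response:\n")[-1].strip()

print(build_prompt("Reverse a string in Python"))
```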
Training method:
QLoRA (4-bit NF4 quantization)

Key parameters:
- LoRA rank: 8
- alpha: 16
- dropout: 0.05
- epochs: 2
- learning rate: 2e-4
- context length: 512
- optimizer: paged_adamw_8bit
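In `peft`, these hyperparameters map onto a `LoraConfig` roughly as follows. This is a sketch: the `target_modules` shown are an assumption (typical Qwen attention projections); the actual modules adapted during training are not documented here.

```python
from peft import LoraConfig

# Sketch of the LoRA setup matching the parameters above.
# target_modules is an ASSUMPTION, not taken from the training run.
lora_config = LoraConfig(
    r=8,                 # LoRA rank
    lora_alpha=16,       # scaling alpha
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```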
---

# Citation

If you use this model, please cite:

Base model:
Qwen2.5 Technical Report (Qwen Team, 2024)

Dataset:
python_code_instructions_18k_alpaca (iamtarun)

Model:
yzy-python-0.5b (SamirXR)

---

# Notes

This is a small model intended for experimentation and lightweight coding assistance.
Its output quality will not match that of large models, but it allows fast local inference with minimal resources.
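As a rough back-of-the-envelope figure (assuming ~0.5B parameters): float16 weights take 2 bytes per parameter, while 4-bit NF4 takes about half a byte, ignoring quantization overhead:

```python
# Rough weight-memory estimate for a 0.5B-parameter model (assumed size)
params = 0.5e9

bytes_fp16 = params * 2    # float16: 2 bytes per parameter
bytes_nf4 = params * 0.5   # 4-bit NF4: ~0.5 bytes per parameter

print(f"fp16 weights: ~{bytes_fp16 / 1e9:.2f} GB")  # ~1.00 GB
print(f"nf4 weights:  ~{bytes_nf4 / 1e9:.2f} GB")   # ~0.25 GB
```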