neuralbroker committed on
Commit
e89e956
·
verified ·
1 Parent(s): 9ad0c4f

Upload MODEL_CARD.md with huggingface_hub

Files changed (1)
  1. MODEL_CARD.md +275 -0
MODEL_CARD.md ADDED
@@ -0,0 +1,275 @@
1
+ ---
2
+ language:
3
+ - en
4
+ library_name: llama-cpp-python
5
+ pipeline_tag: text-generation
6
+ tags:
7
+ - code-generation
8
+ - coding-assistant
9
+ - gguf
10
+ - llama.cpp
11
+ - qwen2.5
12
+ - python
13
+ - javascript
14
+ - fine-tuned
15
+ base_model:
16
+ - Qwen/Qwen2.5-1.5B-Instruct
17
+ ---
18
+
19
+ # BlitzKode
20
+
21
+ **BlitzKode** is an AI coding assistant built by **Sajad** and fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It is packaged as a GGUF model for fast local inference with llama.cpp.
22
+
23
+ > Created by [Abdulla Sajad](https://github.com/neuralbroker)
24
+ > Project: [neuralbroker/blitzkode](https://github.com/neuralbroker/blitzkode)
25
+
26
+ ---
27
+
28
+ ## Model Summary
29
+
30
+ | Property | Value |
31
+ |----------|-------|
32
+ | **Model Name** | BlitzKode |
33
+ | **Version** | 2.0 |
34
+ | **Base Model** | Qwen/Qwen2.5-1.5B-Instruct |
35
+ | **Model Format** | GGUF (F16, ~3GB) |
36
+ | **Primary Runtime** | llama.cpp / llama-cpp-python |
37
+ | **Artifact** | `blitzkode.gguf` |
38
+ | **Context Window** | 2048 tokens |
39
+ | **Creator** | Sajad |
40
+ | **License** | MIT (also see Qwen2.5 upstream license) |
41
+
42
+ ---
43
+
44
+ ## Architecture
45
+
46
+ - **Model Type**: Transformer-based LLM (1.5B parameters)
47
+ - **Architecture**: Qwen2
48
+ - **Quantization**: None (GGUF F16, 16-bit floating-point weights, ~3GB)
49
+ - **Vocabulary**: 151,936 tokens
50
+ - **Inference**: CPU/GPU with llama.cpp (configurable via `BLITZKODE_GPU_LAYERS`)
51
+
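The ~3GB figure is consistent with storing about 1.5 billion weights at 16 bits each. A rough back-of-the-envelope check, assuming the rounded 1.5B parameter count listed above and ignoring GGUF metadata and tokenizer overhead:

```python
# Rough size estimate for an F16 GGUF file.
# Assumptions: 1.5e9 parameters (the rounded count from this card),
# 2 bytes per parameter, file metadata ignored.
PARAMS = 1_500_000_000
BYTES_PER_F16 = 2


def f16_size_gb(n_params: int) -> float:
    """Return the approximate weight size in decimal gigabytes."""
    return n_params * BYTES_PER_F16 / 1e9


print(f"{f16_size_gb(PARAMS):.1f} GB")  # prints "3.0 GB"
```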
52
+ ---
53
+
54
+ ## Training Pipeline
55
+
56
+ BlitzKode was fine-tuned through a 4-stage pipeline:
57
+
58
+ ### 1. SFT (Supervised Fine-Tuning)
59
+ Applies LoRA fine-tuning to coding-style prompts and responses using the PEFT library.
60
+
61
+ ### 2. Reward-based SFT continuation
62
+ Applies additional SFT with heuristic reward functions for code correctness, formatting, and reasoning. Note: this stage uses standard SFT training, not a full GRPO implementation.
63
+
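The card does not publish the reward functions, so the following is only an illustrative sketch of what a heuristic reward for formatting and correctness could look like. The name `heuristic_reward`, the 0.5 weights, and both checks are hypothetical and are not taken from `train_grpo.py`.

```python
import ast
import re

# `{3} matches three backticks without writing a literal fence in this file.
CODE_BLOCK = re.compile(r"`{3}(?:python)?\n(.*?)`{3}", re.DOTALL)


def heuristic_reward(response: str) -> float:
    """Score a model response in [0, 1] using two simple checks."""
    match = CODE_BLOCK.search(response)
    if match is None:
        return 0.0  # formatting: no fenced code block at all
    score = 0.5  # formatting: a fenced code block is present
    try:
        ast.parse(match.group(1))  # correctness proxy: the code must parse
        score += 0.5
    except SyntaxError:
        pass
    return score


fence = "`" * 3
good = f"{fence}python\nprint('hello')\n{fence}"
bad = f"{fence}python\nprint('hello'\n{fence}"  # unbalanced parenthesis
print(heuristic_reward(good), heuristic_reward(bad), heuristic_reward("no code"))
```

One plausible use is to keep only samples that score above a threshold before continuing SFT; the card does not say how the scores are actually applied.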
64
+ ### 3. DPO (Direct Preference Optimization)
65
+ Trains on handcrafted preference pairs to improve clarity and answer quality.
66
+
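For illustration, a preference pair in the `prompt` / `chosen` / `rejected` layout that TRL's `DPOTrainer` accepts might look like the record below. The example text and the `validate_pair` helper are hypothetical; the pairs actually used for BlitzKode are not included in this card.

```python
REQUIRED_KEYS = ("prompt", "chosen", "rejected")

# Hypothetical preference pair: the chosen answer is concrete code,
# the rejected answer is vague.
pair = {
    "prompt": "Write a function that reverses a string in Python.",
    "chosen": "def reverse(s: str) -> str:\n    return s[::-1]",
    "rejected": "You can reverse it with a loop.",
}


def validate_pair(record: dict) -> bool:
    """Require non-empty string fields and distinct chosen/rejected answers."""
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            return False
    return record["chosen"] != record["rejected"]


print(validate_pair(pair))  # prints "True"
```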
67
+ ### 4. Merge & Export
68
+ Merges the LoRA adapters into the base model and converts the result to GGUF format.
69
+
70
+ ### Training Frameworks
71
+ - HuggingFace Transformers
72
+ - PEFT (LoRA)
73
+ - TRL (DPO/GRPO)
74
+ - llama.cpp (inference/export)
75
+
76
+ ---
77
+
78
+ ## Training Data
79
+
80
+ Custom curated coding datasets covering:
81
+ - Algorithm implementation
82
+ - Data structures
83
+ - Code explanations
84
+ - Programming concepts
85
+ - Bug fixing scenarios
86
+
87
+ ---
88
+
89
+ ## Features
90
+
91
+ - **Multi-language Code Generation** - Python, JavaScript, Java, C++, TypeScript, SQL
92
+ - **Code Explanation** - Clear comments and documentation
93
+ - **Bug Fixing** - Debug and fix code issues
94
+ - **Algorithm Assistance** - Data structures and algorithms
95
+ - **Offline Operation** - Runs locally without internet
96
+ - **Fast Inference** - Optimized CPU inference
97
+ - **Modern UI** - Professional dark interface
98
+
99
+ ---
100
+
101
+ ## Intended Use
102
+
103
+ ### Best For
104
+ - Local offline coding assistance
105
+ - Algorithm and data structure help
106
+ - Code generation and explanation
107
+ - Educational programming support
108
+ - Code review and debugging
109
+
110
+ ### Out of Scope
111
+ - Production code without expert review
112
+ - Security-critical applications
113
+ - Multi-modal tasks (images not supported)
114
+ - Long-context repository analysis
115
+
116
+ ---
117
+
118
+ ## API & Usage
119
+
120
+ ### Running the Server
121
+
122
+ ```bash
123
+ # Install dependencies
124
+ pip install llama-cpp-python fastapi uvicorn pydantic
125
+
126
+ # Start server
127
+ python server.py
128
+
129
+ # Open browser
130
+ # http://localhost:7860
131
+ ```
132
+
133
+ ### API Endpoints
134
+
135
+ | Endpoint | Method | Description |
136
+ |----------|--------|-------------|
137
+ | `/` | GET | Web UI |
138
+ | `/health` | GET | Health check |
139
+ | `/info` | GET | API info |
140
+ | `/generate` | POST | Generate response |
141
+ | `/generate/stream` | POST | Stream tokens |
142
+
143
+ ### API Example
144
+
145
+ ```bash
146
+ # Generate code
147
+ curl -X POST http://localhost:7860/generate \
148
+   -H "Content-Type: application/json" \
149
+   -d '{"prompt": "Write hello world in python"}'
150
+ ```
151
+
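The same request can be built from Python with only the standard library. This is a sketch: the JSON body mirrors the curl call above, but the response schema is not documented in this card, so the decoded JSON is returned as-is. `build_generate_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # default port from this card


def build_generate_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /generate with a JSON body."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str):
    """Send the request and decode the JSON reply; needs a running server."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=120) as resp:
        return json.load(resp)


# Building the request does not touch the network.
req = build_generate_request("Write hello world in python")
print(req.get_method(), req.full_url)
```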
152
+ ### Python Usage
153
+
154
+ ```python
155
+ from llama_cpp import Llama
156
+
157
+ llm = Llama(
158
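+     model_path="blitzkode.gguf",  # path to the GGUF artifact
159
+     n_ctx=2048,                   # context window from the model summary
160
+     n_threads=8,                  # adjust to the number of CPU cores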
161
+ )
162
+
163
+ prompt = """<|im_start|>system
164
+ You are BlitzKode, a coding assistant.<|im_end|>
165
+ <|im_start|>user
166
+ Write a hello world in Python<|im_end|>
167
+ <|im_start|>assistant
168
+ """
169
+
170
+ result = llm(prompt, max_tokens=256)
171
+ print(result["choices"][0]["text"])
172
+ ```
173
+
174
+ ---
175
+
176
+ ## Prompt Format
177
+
178
+ Uses ChatML-style template:
179
+
180
+ ```
181
+ <|im_start|>system
182
+ You are BlitzKode, an AI coding assistant created by Sajad. You are an expert in Python, JavaScript, Java, C++, and other programming languages. Write clean, efficient, and well-documented code. Keep responses concise and practical.<|im_end|>
183
+ <|im_start|>user
184
+ {your prompt}<|im_end|>
185
+ <|im_start|>assistant
186
+ ```
187
+
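A small helper can assemble this template so the special tokens are not retyped by hand. `build_prompt` is a hypothetical name, and the default system message is the short one from the Python Usage snippet rather than the full text above.

```python
DEFAULT_SYSTEM = "You are BlitzKode, a coding assistant."


def build_prompt(user_message: str, system_message: str = DEFAULT_SYSTEM) -> str:
    """Format a single-turn ChatML prompt that ends at the assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_prompt("Write a hello world in Python"))
```

The returned string can be passed to `llm(...)` as in the Python Usage section; adding `stop=["<|im_end|>"]` to that call is a common way to end generation at the turn boundary.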
188
+ ---
189
+
190
+ ## Configuration
191
+
192
+ The server supports environment variables:
193
+
194
+ | Variable | Default | Description |
195
+ |----------|---------|-------------|
196
+ | `BLITZKODE_MODEL_PATH` | `blitzkode.gguf` | Model file path |
197
+ | `BLITZKODE_FRONTEND_PATH` | `frontend/index.html` | UI path |
198
+ | `BLITZKODE_HOST` | `0.0.0.0` | Server host |
199
+ | `BLITZKODE_PORT` | `7860` | Server port |
200
+ | `BLITZKODE_THREADS` | CPU count | CPU threads |
201
+ | `BLITZKODE_N_CTX` | `2048` | Context window |
202
+ | `BLITZKODE_BATCH` | `128` | Batch size |
203
+ | `BLITZKODE_MAX_PROMPT_LENGTH` | `4000` | Max prompt chars |
204
+
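One way such variables could be read is sketched below. The variable names and defaults come from the table above, while the `ServerConfig` dataclass and the `load_config` function are hypothetical and may not match what `server.py` does.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    model_path: str
    frontend_path: str
    host: str
    port: int
    threads: int
    n_ctx: int
    batch: int
    max_prompt_length: int


def load_config(env=None) -> ServerConfig:
    """Read BLITZKODE_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ServerConfig(
        model_path=env.get("BLITZKODE_MODEL_PATH", "blitzkode.gguf"),
        frontend_path=env.get("BLITZKODE_FRONTEND_PATH", "frontend/index.html"),
        host=env.get("BLITZKODE_HOST", "0.0.0.0"),
        port=int(env.get("BLITZKODE_PORT", "7860")),
        # The table lists "CPU count" as the default thread count.
        threads=int(env.get("BLITZKODE_THREADS", os.cpu_count() or 1)),
        n_ctx=int(env.get("BLITZKODE_N_CTX", "2048")),
        batch=int(env.get("BLITZKODE_BATCH", "128")),
        max_prompt_length=int(env.get("BLITZKODE_MAX_PROMPT_LENGTH", "4000")),
    )


# Passing a plain dict stands in for the process environment.
cfg = load_config({"BLITZKODE_PORT": "8000"})
print(cfg.port, cfg.n_ctx)  # prints "8000 2048"
```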
205
+ ---
206
+
207
+ ## Limitations
208
+
209
+ - **Text-only input** - No image/vision support
210
+ - **2048 token context** - CPU-friendly but limited
211
+ - **Verify outputs** - Always review generated code before use
212
+ - **Small model** - May occasionally produce incorrect code
213
+
214
+ ---
215
+
216
+ ## Project Structure
217
+
218
+ ```
219
+ BlitzKode/
220
+ ├── server.py               # FastAPI backend (v1.6)
221
+ ├── blitzkode.gguf          # Quantized model (~3GB)
222
+ ├── frontend/
223
+ │   └── index.html          # Web UI
224
+ ├── tests/
225
+ │   └── test_server.py      # HTTP tests
226
+ ├── scripts/
227
+ │   ├── train_sft.py        # SFT training
228
+ │   ├── train_grpo.py       # GRPO training
229
+ │   ├── train_dpo.py        # DPO training
230
+ │   ├── export_gguf.py      # Model export
231
+ │   └── test_inference.py   # Inference test
232
+ ├── checkpoints/            # LoRA checkpoints
233
+ ├── datasets/               # Training data
234
+ ├── MODEL_CARD.md           # This file
235
+ └── README.md               # Project docs
236
+ ```
237
+
238
+ ---
239
+
240
+ ## Version History
241
+
242
+ | Version | Date | Changes |
243
+ |---------|------|---------|
244
+ | 1.6 | Current | CPU optimization, faster inference |
245
+ | 1.5 | Earlier | Added streaming support |
246
+ | 1.0 | Initial | Base model release |
247
+
248
+ ---
249
+
250
+ ## License
251
+
252
+ MIT License - See README.md for details.
253
+
254
+ Also comply with the upstream Qwen base model license when redistributing.
255
+
256
+ ---
257
+
258
+ ## Contact
259
+
260
+ - **GitHub**: https://github.com/neuralbroker/blitzkode
261
+ - **Portfolio**: https://neuralbroker.vercel.app
262
+ - Issues and contributions welcome!
263
+
264
+ ---
265
+
266
+ ## Citation
267
+
268
+ ```bibtex
269
+ @software{blitzkode2026,
270
+ author = {Sajad},
271
+ title = {BlitzKode - AI Coding Assistant},
272
+ year = {2026},
273
+ url = {https://github.com/neuralbroker/blitzkode}
274
+ }
275
+ ```