---
language:
- bg
- en
license: mit
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- code
- bulgarian
- lora
- peft
- vitosha-gpt-code
- slm
- offline
---
<div align="center">

# 🏔️ Vitosha-GPT-Code
### *The coding assistant that speaks Bulgarian — and runs anywhere.*

</div>
---
## 🇧🇬 Why This Exists
> *"Every Bulgarian has the right to AI."*

Right now, AI is a luxury. You need fast internet. You need expensive hardware. You need a subscription. If you're in a remote province on a 10-year-old PC, the 21st century is locked behind a paywall.

**That's wrong. And Vitosha-GPT-Code is the answer.**

---
## 🌍 The Problem We're Solving
```
❌ Remote area? → No cloud access
❌ Old PC / low RAM? → Runs too slow
❌ No fiber optic? → Can't stream tokens
❌ No credit card? → Locked out of ChatGPT
✅ Vitosha-GPT-Code → Works offline. 4GB RAM (planned GGUF build). Free. Forever.
```
A kid in a remote Bulgarian province deserves the **same coding tools** as a developer in Sofia. Whether it's building a website for the family business or learning to program for the first time — **hardware should never be a barrier to entry**.
---
## 🛠️ What It Does
Vitosha-GPT-Code is a **Bulgarian-first coding assistant** — it writes code, explains concepts, and answers technical questions **in Bulgarian by default**.
| Capability | Example |
|---|---|
| 🐍 Write Python functions | "Напиши функция за проверка на просто число" ("Write a function that checks whether a number is prime") |
| 🌐 Build web projects | "Направи уебсайт за малък бизнес" ("Build a website for a small business") |
| 🔁 Multi-turn coding chat | Remembers context across follow-up questions |
| 📖 Explain algorithms | "Обясни как работи binary search" ("Explain how binary search works") |
| 🗃️ SQL queries | "Извлечи всички потребители от таблица users" ("Fetch all users from the users table") |
| 🔧 Debug code | Spots errors and suggests fixes in Bulgarian |
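
Multi-turn chat works by resending the whole conversation on every turn. Here is a minimal sketch of that bookkeeping, assuming the same `messages` format as the Quick Start. The `ChatSession` class and the stand-in `echo_generate` function are hypothetical helpers, not part of this repository:

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Keeps the running message list that is resent to the model each turn."""

    def __init__(self, system_prompt: str) -> None:
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text: str, generate: Callable[[List[Message]], str]) -> str:
        # Add the new user turn, show the model the full history,
        # then store its reply so the next turn has context.
        self.messages.append({"role": "user", "content": user_text})
        reply = generate(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_generate(messages: List[Message]) -> str:
    # Stand-in for a real model call: reports how many user turns it can see.
    user_turns = sum(1 for m in messages if m["role"] == "user")
    return f"seen {user_turns} user turn(s)"


session = ChatSession("Ти си полезен асистент за програмиране. Отговаряш на български.")
session.ask("Напиши функция за проверка на просто число", echo_generate)
print(session.ask("Добави тестове", echo_generate))  # "Add tests"
```

With a real model, `generate` would wrap the `apply_chat_template` and `model.generate` calls from the Quick Start.
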
### 🗣️ Real Example Output
```
Потребител: Напиши функция на Python, която проверява дали число е просто.
Витоша: Ето една проста функция на Python, която проверява дали даденото
число е просто:

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

Тази функция работи така: ако числото е по-малко или равно на 1, не е
просто. За всички останали — проверяваме всеки делител до корена на
числото...
```
✅ **Correct code. Bulgarian explanation. Zero internet. Zero cost.**
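
The function from the sample output can be checked without the model. This small sketch copies it with Python indentation and lists the primes below 20. The check values are chosen for illustration and are not part of the model output:

```python
def is_prime(n):
    # Numbers below 2 are not prime by definition.
    if n <= 1:
        return False
    # Any composite n has a divisor no larger than sqrt(n).
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True


print([n for n in range(20) if is_prime(n)])
```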
---
## ⚡ Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter = "kyleparrratt/Vitosha-GPT-Code"

# Load the base model, then attach the LoRA adapter for inference only.
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter, is_trainable=False)

messages = [
    # System: "You are a helpful programming assistant. You answer in Bulgarian."
    {"role": "system", "content": "Ти си полезен асистент за програмиране. Отговаряш на български."},
    # User: "Write a Python function that checks for a prime number and explain it in Bulgarian."
    {"role": "user", "content": "Напиши функция на Python за проверка на просто число и обясни на български."},
]

# Build the chat prompt, generate greedily, and decode only the new tokens.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
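
The Quick Start downloads files from the Hub on first run. For a machine with no connection, one possible approach is to load everything from local folders. This sketch assumes both repositories were downloaded earlier (for example with `huggingface_hub.snapshot_download`). The `load_offline` helper and its directory arguments are hypothetical:

```python
import os

# Must be set before transformers / huggingface_hub are imported.
os.environ["HF_HUB_OFFLINE"] = "1"


def load_offline(base_dir: str, adapter_dir: str):
    """Load the base model and the adapter from local folders only."""
    # Imported lazily so the environment variable above is already in place.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_dir, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(
        base_dir,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        local_files_only=True,
    )
    model = PeftModel.from_pretrained(model, adapter_dir, is_trainable=False)
    return tokenizer, model
```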
---
## 🔬 How It Was Built
This isn't just a wrapper with a Bulgarian flag slapped on it. It's a LoRA adapter fine-tuned on Bulgarian-language targets so that the model explains and responds in Bulgarian.
| Component | Detail |
|---|---|
| 🧠 **Base Model** | Qwen2.5-Coder-7B-Instruct — one of the strongest open code models available |
| 📚 **Training Data** | Bulgarian coding examples translated from evol-codealpaca-v1: 400 samples in V0.1, 5,000 planned for V0.2 |
| 🌍 **Translation** | OPUS-MT (opus-mt-tc-big-en-bg) — dedicated EN→BG model, GPU-accelerated |
| ⚙️ **Fine-tuning** | LoRA (r=16) with Unsloth for parameter-efficient training |
| 🚫 **No tricks** | No prompt poisoning, no phrase-stuffing — clean Bulgarian training targets |
| 🔒 **Privacy** | Designed for 100% local inference — your data never leaves your machine |
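
To make "LoRA (r=16)" concrete: for a frozen weight matrix of shape `d_out × d_in`, LoRA trains two small matrices, `B` (`d_out × r`) and `A` (`r × d_in`), which adds `r * (d_in + d_out)` parameters per layer. This is a small arithmetic sketch. The 4096 × 4096 layer size is a hypothetical example rather than the real Qwen2.5 dimension, and this card does not state which layers the adapter targets:

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * d_in + d_out * r


d_in = d_out = 4096  # hypothetical layer size, for illustration only
r = 16               # rank from the table above

full = d_in * d_out
lora = lora_param_count(d_in, d_out, r)
print(f"full layer: {full:,} params")
print(f"LoRA r={r}: {lora:,} params ({lora / full:.2%} of the layer)")
```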
---
## 🗺️ Roadmap
```
[✅] V0.1 — LoRA adapter: Bulgarian code explanations & generation
[ ] V0.2 — 5,000-sample OPUS-translated training run + re-train
[ ] V0.3 — GGUF export: run with llama.cpp on 4GB RAM
[ ] V0.4 — Windows installer for offline use with no tech knowledge
[ ] V1.0 — Full offline Bulgarian coding assistant for every Bulgarian
```
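
The 4GB target in V0.3 depends on quantization. This rough sketch estimates weight-only memory at different bit widths. The 7 billion parameter count is the nominal "7B" figure, treated here as an assumption, and a real GGUF file needs extra room for metadata, the KV cache, and runtime buffers:

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


n_params = 7e9  # nominal "7B"; an assumption, not an exact count

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gib(n_params, bits):.1f} GiB")
```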
---
## ⚠️ Current Limitations
- The model occasionally slips into English on complex explanations. Add *"Отговори на български."* ("Answer in Bulgarian.") to the user prompt to keep it consistent.
- Code identifiers, API names, and variable names remain in English (as they should).
- V0.1 is trained on 400 samples; V0.2 will use 5,000.
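
The workaround for English slips can be automated. This minimal sketch appends the instruction to a user prompt only when it is missing. The helper name `with_bulgarian_instruction` is hypothetical:

```python
BG_INSTRUCTION = "Отговори на български."  # "Answer in Bulgarian."


def with_bulgarian_instruction(prompt: str) -> str:
    """Append the Bulgarian-only instruction unless it is already present."""
    prompt = prompt.strip()
    if BG_INSTRUCTION in prompt:
        return prompt
    return f"{prompt}\n\n{BG_INSTRUCTION}"


# "Explain how binary search works"
print(with_bulgarian_instruction("Обясни как работи binary search"))
```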
---
## 🏔️ The Name
**Vitosha** is the mountain that watches over Sofia — visible from the capital, unchanging, accessible to everyone. It doesn't care if you're a professor or a student, if you have a new laptop or an old one. You can walk up Vitosha for free.
That's what this model is.
---
## 📄 License
MIT. Free to use, modify, and deploy. Kept free on purpose.
---
<div align="center">

**Built solo. Kept free. For every Bulgarian. 🇧🇬**

*"От Витоша, за всички."* ("From Vitosha, for everyone.")

</div>