🇧🇬 Why This Exists
"Every Bulgarian has the right to AI."
Right now, AI is a luxury. You need fast internet. You need expensive hardware. You need a subscription. If you're in a remote province on a 10-year-old PC, the 21st century is locked behind a paywall.
That's wrong. And Vitosha-GPT-Code is the answer.
🌍 The Problem We're Solving
❌ Remote area? → No cloud access
❌ Old PC / low RAM? → Runs too slow
❌ No fiber optic? → Can't stream tokens
❌ No credit card? → Locked out of ChatGPT
✅ Vitosha-GPT-Code → Works offline. 4GB RAM. Free. Forever.
A kid in a remote Bulgarian province deserves the same coding tools as a developer in Sofia. Whether it's building a website for the family business or learning to program for the first time — hardware should never be a barrier to entry.
🛠️ What It Does
Vitosha-GPT-Code is a Bulgarian-first coding assistant — it writes code, explains concepts, and answers technical questions in Bulgarian by default.
| Capability | Example |
|---|---|
| 🐍 Write Python functions | "Напиши функция за проверка на просто число" ("Write a function to check for a prime number") |
| 🌐 Build web projects | "Направи уебсайт за малък бизнес" ("Build a website for a small business") |
| 🔁 Multi-turn coding chat | Remembers context across follow-up questions |
| 📖 Explain algorithms | "Обясни как работи binary search" ("Explain how binary search works") |
| 🗃️ SQL queries | "Извлечи всички потребители от таблица users" ("Fetch all users from the users table") |
| 🔧 Debug code | Spot errors and suggest fixes in Bulgarian |
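As a hedged illustration of the SQL capability above, the request "Извлечи всички потребители от таблица users" maps to a plain `SELECT`. The sketch below runs it against a throwaway in-memory SQLite database; the schema and sample rows are invented for the example, not taken from the model:

```python
import sqlite3

# Throwaway in-memory database (schema and rows invented for this example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Иван",), ("Мария",)])

# "Извлечи всички потребители от таблица users" -> fetch all users
rows = conn.execute("SELECT * FROM users").fetchall()
print(rows)  # [(1, 'Иван'), (2, 'Мария')]
conn.close()
```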
🗣️ Real Example Output
Потребител: Напиши функция на Python, която проверява дали число е просто.
(User: Write a Python function that checks whether a number is prime.)

Витоша: Ето една проста функция на Python, която проверява дали даденото число е просто:
(Vitosha: Here is a simple Python function that checks whether a given number is prime:)

```python
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True
```

Тази функция работи така: ако числото е по-малко или равно на 1, не е просто. За всички останали — проверяваме всеки делител до корена на числото...
(How it works: if the number is less than or equal to 1, it is not prime. For everything else, we check every divisor up to the square root of the number...)
✅ Correct code. Bulgarian explanation. Zero internet. Zero cost.
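And the claim holds up: the model's `is_prime` is reproduced below (so the snippet is self-contained) with a quick sanity check against known primes and composites:

```python
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

# Spot-check edge cases, known primes, and known composites.
assert [n for n in range(20) if is_prime(n)] == [2, 3, 5, 7, 11, 13, 17, 19]
assert not is_prime(1) and not is_prime(25) and is_prime(97)
```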
⚡ Quick Start
```python
# pip install torch transformers peft
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter = "kyleparrratt/Vitosha-GPT-Code"

# Load the base model and attach the Bulgarian LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter, is_trainable=False)

messages = [
    {"role": "system", "content": "Ти си полезен асистент за програмиране. Отговаряш на български."},
    {"role": "user", "content": "Напиши функция на Python за проверка на просто число и обясни на български."},
]

# Build the chat prompt, generate deterministically, and print only the newly generated tokens.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
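On low-memory machines with a CUDA GPU, 4-bit quantized loading via bitsandbytes can shrink the footprint considerably. This is a sketch, not a tested configuration for this model; the options shown are standard transformers/bitsandbytes parameters, and the specific values are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch only: these quantization settings are typical defaults,
# not a configuration verified against Vitosha-GPT-Code.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb,
    device_map="auto",
)
```

The LoRA adapter attaches on top exactly as in the full-precision example.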
🔬 How It Was Built
This isn't just a wrapper with a Bulgarian flag slapped on it. It's a LoRA fine-tune purpose-trained to think and respond in Bulgarian.
| Component | Detail |
|---|---|
| 🧠 Base Model | Qwen2.5-Coder-7B-Instruct — one of the strongest open code models available |
| 📚 Training Data | 400 Bulgarian coding examples translated from evol-codealpaca-v1 (5,000 planned for V0.2) |
| 🌍 Translation | OPUS-MT (opus-mt-tc-big-en-bg) — dedicated EN→BG model, GPU-accelerated |
| ⚙️ Fine-tuning | LoRA (r=16) with Unsloth, efficient parameter training |
| 🚫 No tricks | No prompt poisoning, no phrase-stuffing — clean Bulgarian training targets |
| 🔒 Privacy | Designed for 100% local inference — your data never leaves your machine |
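For readers curious what "LoRA (r=16)" means in practice, here is a hedged sketch of such a configuration in plain `peft`. Only `r=16` comes from the table above; the alpha, dropout, and target modules are typical values assumed for illustration, not the actual training setup:

```python
from peft import LoraConfig

# r=16 is from the model card; everything else is an assumed,
# commonly used default, not the real training configuration.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```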
🗺️ Roadmap
- [x] V0.1 — LoRA adapter: Bulgarian code explanations & generation
- [ ] V0.2 — 5,000-sample OPUS-translated training run + re-train
- [ ] V0.3 — GGUF export: run with llama.cpp on 4GB RAM
- [ ] V0.4 — Windows installer for offline use with no tech knowledge
- [ ] V1.0 — Full offline Bulgarian coding assistant for every Bulgarian
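Once the V0.3 GGUF export lands, fully offline use on a 4GB machine would look roughly like this with llama.cpp's `llama-cli`. The file name is hypothetical, since no GGUF has been published yet:

```shell
# Hypothetical file name: no GGUF has been released yet (roadmap V0.3).
llama-cli -m vitosha-gpt-code-Q4_K_M.gguf \
  -p "Напиши функция на Python за проверка на просто число." \
  -n 256
```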
⚠️ Current Limitations
- Occasional slips into English on complex explanations. Add "Отговори на български." ("Answer in Bulgarian.") to the user prompt to stay consistent.
- Code identifiers, API names, and variable names remain in English (as they should).
- V0.1 is trained on 400 samples; V0.2 will use 5,000.
🏔️ The Name
Vitosha is the mountain that watches over Sofia — visible from the capital, unchanging, accessible to everyone. It doesn't care if you're a professor or a student, if you have a new laptop or an old one. You can walk up Vitosha for free.
That's what this model is.
📄 License
MIT. Free to use, modify, and deploy. Kept free on purpose.
Built solo. Kept free. For every Bulgarian. 🇧🇬
"От Витоша, за всички." ("From Vitosha, for everyone.")
