🏔️ Vitosha-GPT-Code

The coding assistant that speaks Bulgarian — and runs anywhere.




🇧🇬 Why This Exists

"Every Bulgarian has the right to AI."

Right now, AI is a luxury. You need fast internet. You need expensive hardware. You need a subscription. If you're in a remote province on a 10-year-old PC, the 21st century is locked behind a paywall.

That's wrong. And Vitosha-GPT-Code is the answer.


🌍 The Problem We're Solving

❌  Remote area?          → No cloud access
❌  Old PC / low RAM?     → Runs too slow  
❌  No fiber optic?       → Can't stream tokens
❌  No credit card?       → Locked out of ChatGPT

✅  Vitosha-GPT-Code      → Works offline. 4GB RAM. Free. Forever.

A kid in a remote Bulgarian province deserves the same coding tools as a developer in Sofia. Whether it's building a website for the family business or learning to program for the first time — hardware should never be a barrier to entry.


🛠️ What It Does

Vitosha-GPT-Code is a Bulgarian-first coding assistant — it writes code, explains concepts, and answers technical questions in Bulgarian by default.

| Capability | Example |
|---|---|
| 🐍 Write Python functions | "Напиши функция за проверка на просто число" |
| 🌐 Build web projects | "Направи уебсайт за малък бизнес" |
| 🔁 Multi-turn coding chat | Remembers context across follow-up questions |
| 📖 Explain algorithms | "Обясни как работи binary search" |
| 🗃️ SQL queries | "Извлечи всички потребители от таблица users" |
| 🔧 Debug code | Spots errors and suggests fixes in Bulgarian |

🗣️ Real Example Output

Потребител: Напиши функция на Python, която проверява дали число е просто.

Витоша: Ето една проста функция на Python, която проверява дали даденото
число е просто:

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

Тази функция работи така: ако числото е по-малко или равно на 1, не е
просто. За всички останали — проверяваме всеки делител до корена на
числото...

Correct code. Bulgarian explanation. Zero internet. Zero cost.
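
The generated function is easy to sanity-check locally; copying it into a Python session:

```python
def is_prime(n):
    # Numbers <= 1 are not prime by definition
    if n <= 1:
        return False
    # Any composite n has a divisor no larger than sqrt(n)
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

# Spot-check against known primes and composites
assert [x for x in range(20) if is_prime(x)] == [2, 3, 5, 7, 11, 13, 17, 19]
assert not is_prime(25)
print("ok")
```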


⚡ Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter = "kyleparrratt/Vitosha-GPT-Code"

# Load the base model, then attach the Vitosha LoRA adapter on top
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter, is_trainable=False)

messages = [
    {"role": "system", "content": "Ти си полезен асистент за програмиране. Отговаряш на български."},
    {"role": "user", "content": "Напиши функция на Python за проверка на просто число и обясни на български."},
]

# Build the chat prompt and generate deterministically (greedy decoding)
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False, pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))

🔬 How It Was Built

This isn't just a wrapper with a Bulgarian flag slapped on it. It's a strong open code model fine-tuned specifically to think and respond in Bulgarian.

| Component | Detail |
|---|---|
| 🧠 Base Model | Qwen2.5-Coder-7B-Instruct, one of the strongest open code models available |
| 📚 Training Data | Bulgarian coding examples translated from evol-codealpaca-v1 (400 in V0.1, scaling to 5,000 in V0.2) |
| 🌍 Translation | OPUS-MT (opus-mt-tc-big-en-bg), a dedicated EN→BG model, GPU-accelerated |
| ⚙️ Fine-tuning | LoRA (r=16) with Unsloth, efficient parameter training |
| 🚫 No tricks | No prompt poisoning, no phrase-stuffing, clean Bulgarian training targets |
| 🔒 Privacy | Designed for 100% local inference; your data never leaves your machine |
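
As a rough illustration of the data side: each translated example has to end up as a chat-formatted training record before LoRA fine-tuning. A minimal sketch, where the record schema and helper name are assumptions for illustration, not the project's actual pipeline:

```python
import json

SYSTEM_BG = "Ти си полезен асистент за програмиране. Отговаряш на български."

def to_chat_record(instruction_bg: str, answer_bg: str) -> dict:
    """Pack one translated (EN→BG) example into a chat-style training record.
    Hypothetical schema for illustration only."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_BG},
            {"role": "user", "content": instruction_bg},
            {"role": "assistant", "content": answer_bg},
        ]
    }

record = to_chat_record(
    "Напиши функция за проверка на просто число",
    "def is_prime(n): ...",
)
# One JSONL line per example, with Cyrillic preserved as-is
print(json.dumps(record, ensure_ascii=False))
```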

🗺️ Roadmap

[✅] V0.1 — LoRA adapter: Bulgarian code explanations & generation
[ ] V0.2 — 5,000-sample OPUS-translated training run + re-train
[ ] V0.3 — GGUF export: run with llama.cpp on 4GB RAM
[ ] V0.4 — Windows installer for offline use with no tech knowledge
[ ] V1.0 — Full offline Bulgarian coding assistant for every Bulgarian

⚠️ Current Limitations

  • Occasional slip into English on complex explanations. Add "Отговори на български." to the user prompt to stay consistent.
  • Code identifiers, API names, and variable names remain in English (as they should).
  • V0.1 is trained on 400 samples; V0.2 will use 5,000.
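
The first limitation has a simple programmatic workaround: append the reminder to the user turn before generating. A small hypothetical helper (not part of the model or any library API):

```python
BG_REMINDER = "Отговори на български."

def enforce_bulgarian(messages: list[dict]) -> list[dict]:
    """Append the Bulgarian-only reminder to the last user message.
    Workaround for occasional English slips; does not mutate the input."""
    out = [dict(m) for m in messages]
    for m in reversed(out):
        if m["role"] == "user":
            if BG_REMINDER not in m["content"]:
                m["content"] = m["content"].rstrip() + " " + BG_REMINDER
            break
    return out

msgs = [{"role": "user", "content": "Обясни как работи binary search"}]
print(enforce_bulgarian(msgs)[0]["content"])
```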

🏔️ The Name

Vitosha is the mountain that watches over Sofia — visible from the capital, unchanging, accessible to everyone. It doesn't care if you're a professor or a student, if you have a new laptop or an old one. You can walk up Vitosha for free.

That's what this model is.


📄 License

MIT. Free to use, modify, and deploy. Kept free on purpose.


Built solo. Kept free. For every Bulgarian. 🇧🇬

"От Витоша, за всички."
