🇲🇰 MK-LLM: The First Open Macedonian Language Model
About This Project
MK-LLM is Macedonia's first open-source Large Language Model (LLM), developed for the community, by the community. This project is led by AI Now - Association for Artificial Intelligence in Macedonia.
Website: www.ainow.mk
Contact: contact@ainow.mk
Model: MK-LLM-Mistral
GitHub: MK-LLM
Latest Updates (14.10.2025)
- OpenAI-compatible endpoints: `/v1/chat/completions`, `/v1/completions`, `/v1/models`, with JSON SSE streaming
- QLoRA training pipeline (4-bit) with LoRA adapters and gradient checkpointing
- Upgraded Macedonian data pipeline: cleaner extraction (trafilatura), gcld3 language filter, MinHash dedup
- Gradio demo UI and improved FastAPI server (env-based config, lazy model load, quantization toggles)
- Repository hygiene: LICENSE, model/dataset cards, Makefile, package inits, `.gitkeep` for data/models
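The MinHash deduplication step mentioned above can be illustrated with a dependency-free sketch. The real pipeline presumably uses a dedicated library (e.g. datasketch); the shingle size and permutation count here are illustrative assumptions:

```python
import hashlib
import re

def shingles(text, n=3):
    """Split text into word n-grams (shingles)."""
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

def minhash_signature(text, num_perm=64):
    """For each of num_perm seeded hash functions, keep the minimum
    hash value over the document's shingles."""
    sig = []
    for seed in range(num_perm):
        sig.append(min(
            int.from_bytes(hashlib.md5(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingles(text)
        ))
    return sig

def est_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Documents whose estimated similarity exceeds a chosen threshold are treated as near-duplicates and dropped.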
Repository Structure
MK-LLM/
├── data/
│   ├── wikipedia/
│   │   ├── download_wiki.py
│   │   └── parse_wiki.py
│   ├── cleaned/
│   ├── processed/
│   ├── raw/
│   ├── tokenized/
│   ├── eval/
│   │   └── mk_eval.jsonl
│   ├── process_all_data.py
│   └── clean_wikipedia.py
├── examples/
│   ├── client_python.py
│   ├── client_js.mjs
│   ├── data_loader.py
│   └── train_mistral_mk.py
├── inference/
│   ├── api.py
│   ├── gradio_app.py
│   └── chatbot.py
├── training/
│   ├── train_pipeline.py
│   └── fine_tune_mistral.py
├── scripts/
│   ├── preprocess_data.py
│   └── evaluate.py
├── configs/
│   ├── train_small.yaml
│   └── train_full.yaml
├── tests/
│   ├── test_api.py
│   ├── test_model.py
│   └── test_dataset.py
├── docs/
│   ├── EXTENDING.md
│   └── GITHUB_ISSUES.md
├── .github/
│   ├── workflows/ci.yml
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.yml
│   │   └── feature_request.yml
│   └── PULL_REQUEST_TEMPLATE.md
├── models/
├── notebooks/
│   └── evaluation.ipynb
├── Dockerfile
├── docker-compose.yml
├── Makefile
├── requirements.txt
├── constraints.txt
├── LICENSE
├── MODEL_CARD.md
├── DATASET_CARD.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
└── README.md
Getting Started
- Clone the repository:
git clone https://github.com/AI-now-mk/MK-LLM.git
cd MK-LLM
- Install dependencies:
pip install -r requirements.txt
Optional (recommended): use a virtual environment
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
- Configure environment (optional)
Create a `.env` file in the project root:
HOST=0.0.0.0
PORT=8000
ALLOW_ORIGINS=*
MODEL_PATH=./models/mistral-finetuned-mk
MODEL_ID=mk-llm
TRUST_REMOTE_CODE=true
LOAD_IN_4BIT=false
LOAD_IN_8BIT=false
TORCH_DTYPE=float16
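A sketch of how a server might read this configuration with the standard library; the keys mirror the `.env` above, but the defaults and parsing are illustrative assumptions rather than the project's actual code:

```python
import os

def env_flag(name, default="false"):
    """Interpret common truthy strings for boolean env vars."""
    return os.getenv(name, default).strip().lower() in {"1", "true", "yes"}

# Keys mirror the .env example above; defaults are illustrative assumptions.
config = {
    "host": os.getenv("HOST", "0.0.0.0"),
    "port": int(os.getenv("PORT", "8000")),
    "model_path": os.getenv("MODEL_PATH", "./models/mistral-finetuned-mk"),
    "load_in_4bit": env_flag("LOAD_IN_4BIT"),
    "load_in_8bit": env_flag("LOAD_IN_8BIT"),
}
```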
- Quick run: inference API
# Ensure a model exists at ./models/mistral-finetuned-mk (train or download)
python -m inference.api
# In another terminal, call the API
curl -X POST http://localhost:8000/generate \
-H "Content-Type: application/json" \
-d '{"prompt":"Здраво Македонија!", "max_new_tokens":128}'
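The same call from Python, using only the standard library. The request fields mirror the curl example; the response shape is whatever the API returns and is not assumed here:

```python
import json
import urllib.request

API_URL = "http://localhost:8000/generate"

def build_request(prompt, max_new_tokens=128):
    """Build the JSON body used by the /generate endpoint above."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}

def generate(prompt, max_new_tokens=128):
    """POST the prompt to the local API and return the parsed JSON response."""
    body = json.dumps(build_request(prompt, max_new_tokens)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Requires the API server from the previous step to be running.
    print(generate("Здраво Македонија!"))
```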
- Optional: Gradio demo UI
python -m inference.gradio_app
# Open http://localhost:7860
- Prepare data (Macedonian)
# Download and extract Macedonian Wikipedia
python -m data.wikipedia.download_wiki
# Parse Wikipedia dump into clean text
python -m data.wikipedia.parse_wiki
# Collect web + combine + clean + mk language filter
python -m data.process_all_data
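The pipeline's language filter uses gcld3. As a dependency-free illustration of what such a filtering pass does, here is a crude Cyrillic-ratio heuristic; this is not the project's actual filter, and on its own it cannot distinguish Macedonian from other Cyrillic-script languages:

```python
def cyrillic_ratio(text):
    """Share of alphabetic characters that fall in the Cyrillic block."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    cyr = sum(1 for c in letters if "\u0400" <= c <= "\u04FF")
    return cyr / len(letters)

def looks_macedonian(text, threshold=0.8):
    """Keep lines that are overwhelmingly Cyrillic. A real language
    identifier (e.g. gcld3) is needed to separate mk from bg, sr, ru, etc."""
    return cyrillic_ratio(text) >= threshold
```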
- Train (example)
python -m training.train_pipeline
# or
python -m training.fine_tune_mistral
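The LoRA adapters used in training add a trainable low-rank update to frozen base weights, so the effective weight is W' = W + (alpha/r)·B·A. A dependency-free sketch of that arithmetic (shapes and the alpha value are illustrative):

```python
def matmul(A, B):
    """Plain list-of-lists matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply_lora(W, A, B, alpha=16.0):
    """Effective weight W + (alpha / r) * B @ A, where r is the LoRA rank.
    A is (r x in_dim), B is (out_dim x r); W stays frozen, only A and B train."""
    r = len(A)
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
```

Because r is small, A and B hold far fewer parameters than W, which is what makes fine-tuning a 7B model feasible on modest hardware, especially with the base weights quantized to 4-bit as in QLoRA.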
Docker
Build and run the API with Docker:
docker build -t mk-llm .
docker run --gpus all -p 8000:8000 -e MODEL_PATH=./models/mistral-finetuned-mk mk-llm
Or via docker-compose:
docker-compose up --build
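For orientation, a minimal compose file for this setup might look like the sketch below. The repository ships its own docker-compose.yml, so the service name, paths, and GPU stanza here are illustrative assumptions, not the project's actual file:

```yaml
# Hypothetical sketch only; see the repository's docker-compose.yml.
services:
  mk-llm:
    build: .
    ports:
      - "8000:8000"
    environment:
      MODEL_PATH: /app/models/mistral-finetuned-mk  # assumed container path
    volumes:
      - ./models:/app/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```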
Continuous Integration
This repository includes a GitHub Actions CI workflow that lints, type-checks, and runs tests on pull requests and commits to main.
Constraints (reproducible installs)
To install with pinned versions:
pip install -r requirements.txt -c constraints.txt
OpenAI-compatible endpoints
This server exposes OpenAI-style routes so common clients, including gpt-oss-compatible tooling, can connect.
- Chat Completions (streaming supported):
curl http://localhost:8000/v1/chat/completions \
-H 'Content-Type: application/json' \
-d '{
"model": "mk-llm",
"messages": [
{"role": "system", "content": "Ти си помошник кој зборува на македонски."},
{"role": "user", "content": "Која е историјата на Охрид?"}
],
"stream": false
}'
- Text Completions:
curl http://localhost:8000/v1/completions \
-H 'Content-Type: application/json' \
-d '{
"prompt": "Здраво Македонија!",
"max_tokens": 128
}'
Related project: OpenAI gpt-oss (open-weight models, client compatibility notes). See https://github.com/openai/gpt-oss.
Use with gpt-oss-compatible clients
Point any OpenAI-compatible client to this server.
Example (Python OpenAI SDK environment):
export OPENAI_API_KEY=dummy
export OPENAI_BASE_URL=http://localhost:8000/v1
Example Chat Completions (curl):
curl "$OPENAI_BASE_URL/chat/completions" \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "mk-llm",
"messages": [
{"role": "system", "content": "Ти си помошник кој зборува на македонски."},
{"role": "user", "content": "Која е историјата на Охрид?"}
],
"stream": true
}'
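With "stream": true the server emits OpenAI-style SSE "data:" lines. A minimal stdlib parser sketch, assuming the standard chat-completions chunk shape (choices[0].delta.content) and the "[DONE]" end-of-stream sentinel:

```python
import json

def parse_sse_chunks(lines):
    """Concatenate content deltas from OpenAI-style SSE 'data:' lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alives, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```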
Model tree for ainowmk/MK-LLM-Mistral
Base model: mistralai/Mistral-7B-v0.1