🇲🇰 MK-LLM: The First Open Macedonian Language Model

🌍 About This Project

MK-LLM is Macedonia's first open-source Large Language Model (LLM), developed for the community, by the community. This project is led by AI Now - Association for Artificial Intelligence in Macedonia.

📌 Website: www.ainow.mk
📩 Contact: contact@ainow.mk
🛠 Model: MK-LLM-Mistral
💻 GitHub: MK-LLM

🆕 Latest Updates (14.10.2025)

  • OpenAI-compatible endpoints: /v1/chat/completions, /v1/completions, /v1/models with JSON SSE streaming
  • QLoRA training pipeline (4-bit) with LoRA adapters and gradient checkpointing
  • Upgraded Macedonian data pipeline: cleaner extraction (trafilatura), gcld3 language filter, MinHash dedup
  • Gradio demo UI and improved FastAPI server (env-based config, lazy model load, quantization toggles)
  • Repository hygiene: LICENSE, model/dataset cards, Makefile, package inits, .gitkeep for data/models
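
The MinHash deduplication mentioned in the data pipeline item can be illustrated with a small standard-library sketch. This is not the project's implementation: the shingle size, the number of hash functions, the 0.8 threshold, and the function names (shingles, minhash_signature, estimated_jaccard, deduplicate) are assumptions chosen for illustration.

```python
import hashlib


def shingles(text: str, k: int = 5) -> set[str]:
    """Return the overlapping k-character shingles of a whitespace/case-normalized text."""
    normalized = " ".join(text.lower().split())
    if len(normalized) <= k:
        return {normalized}
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def _seeded_hash(seed: int, shingle: str) -> int:
    # blake2b takes a salt of up to 16 bytes; the seed selects the hash function.
    digest = hashlib.blake2b(
        shingle.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text: str, num_perm: int = 64, k: int = 5) -> list[int]:
    """For each seeded hash function, keep the minimum hash over all shingles."""
    items = shingles(text, k)
    return [min(_seeded_hash(seed, s) for s in items) for seed in range(num_perm)]


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """The fraction of matching signature slots estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def deduplicate(docs: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a document only if it is not a near-duplicate of one already kept."""
    kept: list[str] = []
    signatures: list[list[int]] = []
    for doc in docs:
        sig = minhash_signature(doc)
        if all(estimated_jaccard(sig, other) < threshold for other in signatures):
            kept.append(doc)
            signatures.append(sig)
    return kept
```

The pairwise comparison in deduplicate is quadratic in the number of documents, which is only workable for small samples. Pipelines at corpus scale usually bucket signatures with locality-sensitive hashing so that only likely duplicates are compared.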

📂 Repository Structure

MK-LLM/
├── data/
│   ├── wikipedia/
│   │   ├── download_wiki.py
│   │   └── parse_wiki.py
│   ├── cleaned/
│   ├── processed/
│   ├── raw/
│   ├── tokenized/
│   ├── eval/
│   │   └── mk_eval.jsonl
│   ├── process_all_data.py
│   └── clean_wikipedia.py
├── examples/
│   ├── client_python.py
│   ├── client_js.mjs
│   ├── data_loader.py
│   └── train_mistral_mk.py
├── inference/
│   ├── api.py
│   ├── gradio_app.py
│   └── chatbot.py
├── training/
│   ├── train_pipeline.py
│   └── fine_tune_mistral.py
├── scripts/
│   ├── preprocess_data.py
│   └── evaluate.py
├── configs/
│   ├── train_small.yaml
│   └── train_full.yaml
├── tests/
│   ├── test_api.py
│   ├── test_model.py
│   └── test_dataset.py
├── docs/
│   ├── EXTENDING.md
│   └── GITHUB_ISSUES.md
├── .github/
│   ├── workflows/ci.yml
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.yml
│   │   └── feature_request.yml
│   └── PULL_REQUEST_TEMPLATE.md
├── models/
├── notebooks/
│   └── evaluation.ipynb
├── Dockerfile
├── docker-compose.yml
├── Makefile
├── requirements.txt
├── constraints.txt
├── LICENSE
├── MODEL_CARD.md
├── DATASET_CARD.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
└── README.md

Getting Started

  1. Clone the repository:
git clone https://github.com/AI-now-mk/MK-LLM.git
cd MK-LLM
  2. Install dependencies:
pip install -r requirements.txt

Optional (recommended): use a virtual environment

python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
  3. Configure environment (optional). Create a .env file in the project root:
HOST=0.0.0.0
PORT=8000
ALLOW_ORIGINS=*
MODEL_PATH=./models/mistral-finetuned-mk
MODEL_ID=mk-llm
TRUST_REMOTE_CODE=true
LOAD_IN_4BIT=false
LOAD_IN_8BIT=false
TORCH_DTYPE=float16
  4. Quick run: inference API
# Ensure a model exists at ./models/mistral-finetuned-mk (train or download)
python -m inference.api
# In another terminal, call the API (the prompt means "Hello, Macedonia!")
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Здраво Македонија!", "max_new_tokens":128}'
  5. Optional: Gradio demo UI
python -m inference.gradio_app
# Open http://localhost:7860
  6. Prepare data (Macedonian)
# Download and extract Macedonian Wikipedia
python -m data.wikipedia.download_wiki
# Parse Wikipedia dump into clean text
python -m data.wikipedia.parse_wiki
# Collect web + combine + clean + mk language filter
python -m data.process_all_data
  7. Train (example)
python -m training.train_pipeline
# or
python -m training.fine_tune_mistral
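
The optional .env settings from the configuration step can be read with a small loader. A minimal sketch, assuming the variable names listed in that step, defaults that mirror the example values, and boolean flags spelled true/false. ServerConfig and load_config are hypothetical names, not the code in inference/api.py, and the sketch reads an already-populated mapping such as os.environ rather than parsing the .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUE_VALUES = {"1", "true", "yes", "on"}


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int
    allow_origins: list[str]
    model_path: str
    model_id: str
    trust_remote_code: bool
    load_in_4bit: bool
    load_in_8bit: bool
    torch_dtype: str


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build the server settings from environment variables (defaults are assumptions)."""
    config = ServerConfig(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        # A comma-separated list of origins; "*" allows any origin.
        allow_origins=[o.strip() for o in env.get("ALLOW_ORIGINS", "*").split(",")],
        model_path=env.get("MODEL_PATH", "./models/mistral-finetuned-mk"),
        model_id=env.get("MODEL_ID", "mk-llm"),
        trust_remote_code=_as_bool(env.get("TRUST_REMOTE_CODE", "true")),
        load_in_4bit=_as_bool(env.get("LOAD_IN_4BIT", "false")),
        load_in_8bit=_as_bool(env.get("LOAD_IN_8BIT", "false")),
        torch_dtype=env.get("TORCH_DTYPE", "float16"),
    )
    # The two quantization modes are mutually exclusive.
    if config.load_in_4bit and config.load_in_8bit:
        raise ValueError("Set at most one of LOAD_IN_4BIT and LOAD_IN_8BIT")
    return config
```

Calling load_config() with no argument reads the process environment, while passing a plain dict makes the function easy to exercise in tests.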

Docker

Build and run the API with Docker:

docker build -t mk-llm .
docker run --gpus all -p 8000:8000 -e MODEL_PATH=./models/mistral-finetuned-mk mk-llm

Or via docker-compose:

docker-compose up --build

Continuous Integration

This repository includes a GitHub Actions CI workflow that lints, type-checks, and runs tests on pull requests and commits to main.

Constraints (reproducible installs)

To install with pinned versions:

pip install -r requirements.txt -c constraints.txt

OpenAI-compatible endpoints

This server exposes OpenAI-style routes so that common clients, including gpt-oss-compatible tooling, can connect.

  • Chat Completions (streaming supported). The system message below means "You are an assistant who speaks Macedonian." and the user message asks "What is the history of Ohrid?":
curl http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mk-llm",
    "messages": [
      {"role": "system", "content": "Ти си помошник кој зборува на македонски."},
      {"role": "user", "content": "Која е историјата на Охрид?"}
    ],
    "stream": false
  }'
  • Text Completions:
curl http://localhost:8000/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Здраво Македонија!",
    "max_tokens": 128
  }'
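
For scripts that should avoid extra dependencies, the text completion request can also be built with Python's standard library. A minimal sketch, assuming the server runs at http://localhost:8000 as in the curl examples and returns an OpenAI-style body with the generated text at choices[0].text (an assumption about the response shape; check the actual output). build_completion_request and complete are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed local server address


def build_completion_request(
    prompt: str, max_tokens: int = 128, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Prepare a POST request for the /v1/completions route."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 128, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (assumed response shape)."""
    request = build_completion_request(prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example, with the server running:
#   print(complete("Здраво Македонија!"))
```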

Related project: OpenAI gpt-oss (open-weight models, client compatibility notes). See https://github.com/openai/gpt-oss.

Use with gpt-oss-compatible clients

Point any OpenAI-compatible client to this server.

Example (environment variables read by the OpenAI Python SDK):

export OPENAI_API_KEY=dummy
export OPENAI_BASE_URL=http://localhost:8000/v1
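
With those variables exported, the official openai Python package (version 1 or later) can talk to the server. A minimal sketch, assuming the package is installed (pip install openai) and that the server accepts a placeholder key. The ask and build_messages helpers are hypothetical names, and the fallback values simply repeat the exports above.

```python
import os

SYSTEM_PROMPT = "Ти си помошник кој зборува на македонски."  # "You are an assistant who speaks Macedonian."


def build_messages(question: str, system_prompt: str = SYSTEM_PROMPT) -> list[dict]:
    """Assemble a chat history with one system message and one user message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def ask(question: str, model: str = "mk-llm") -> str:
    """Send one chat completion request and return the assistant's reply."""
    # Imported here so the module still loads where the SDK is not installed.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("OPENAI_API_KEY", "dummy"),
        base_url=os.environ.get("OPENAI_BASE_URL", "http://localhost:8000/v1"),
    )
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(question),
    )
    return response.choices[0].message.content


# Example, with the server running:
#   print(ask("Која е историјата на Охрид?"))  # "What is the history of Ohrid?"
```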

Example Chat Completions (curl):

curl "$OPENAI_BASE_URL/chat/completions" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "mk-llm",
    "messages": [
      {"role": "system", "content": "Ти си помошник кој зборува на македонски."},
      {"role": "user", "content": "Која е историјата на Охрид?"}
    ],
    "stream": true
  }'
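
With "stream": true, OpenAI-style servers reply with Server-Sent Events: each event arrives as a line starting with data: followed by a JSON chunk, and the stream ends with data: [DONE]. A minimal parsing sketch, assuming this server follows that convention and places incremental text at choices[0].delta.content. Both points are assumptions about the response format that this README does not confirm, and iter_stream_text is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an OpenAI-style SSE stream, line by line."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The response object returned by urllib.request.urlopen iterates over lines of bytes, so it can be passed to iter_stream_text directly and each fragment printed as it arrives.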

Model weights: Safetensors format, 0.3B params, tensor type F16