---
language:
- mk
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- macedonian
- cyrillic
- mistral
- qlora
- peft
base_model: mistralai/Mistral-7B-v0.1
datasets:
- ainowmk/MK-LLM-Mistral-data
metrics:
- perplexity
pretty_name: MK-LLM (Mistral)
model-index:
- name: MK-LLM-Mistral
  results: []
---

# 🇲🇰 MK-LLM: The First Open Macedonian Language Model

## 🌍 About This Project

MK-LLM is Macedonia's first open-source Large Language Model (LLM), developed for the community, by the community. The project is led by AI Now - Association for Artificial Intelligence in Macedonia.

📌 **Website:** [www.ainow.mk](https://www.ainow.mk)

📩 **Contact:** [contact@ainow.mk](mailto:contact@ainow.mk)

🛠 **Model:** [MK-LLM-Mistral](https://huggingface.co/ainowmk/MK-LLM-Mistral)

💻 **GitHub:** [MK-LLM](https://github.com/AI-now-mk/MK-LLM)

## 🆕 Latest Updates (14.10.2025)

- OpenAI-compatible endpoints: `/v1/chat/completions`, `/v1/completions`, and `/v1/models`, with JSON SSE streaming
- QLoRA training pipeline (4-bit) with LoRA adapters and gradient checkpointing
- Upgraded Macedonian data pipeline: cleaner extraction (trafilatura), gcld3 language filtering, MinHash deduplication
- Gradio demo UI and improved FastAPI server (env-based config, lazy model loading, quantization toggles)
- Repository hygiene: LICENSE, model/dataset cards, Makefile, package inits, `.gitkeep` for data/models

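The LoRA adapters mentioned above add a trainable low-rank update to each frozen weight matrix: `W_eff = W + (alpha / r) * B @ A`, where `B` is `(out, r)` and `A` is `(r, in)`. A pure-Python toy sketch of that arithmetic (illustrative shapes only; the actual pipeline uses torch tensors via `peft`):

```python
# Illustrative sketch of the LoRA update W_eff = W + (alpha / r) * (B @ A).
# Pure Python on tiny lists; real training uses torch tensors via peft.

def matmul(X, Y):
    """Naive matrix multiply for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Frozen base weight W plus the scaled low-rank update (alpha / r) * B @ A."""
    delta = matmul(B, A)  # (out, r) @ (r, in) -> (out, in)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

# Tiny example: 2x2 base weight, rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # (out=2, r=1)
A = [[0.5, 0.5]]     # (r=1, in=2)
print(lora_effective_weight(W, A, B, alpha=2.0, r=1))
# -> [[2.0, 1.0], [2.0, 3.0]]
```

Because only `A` and `B` are trained (and the base weights stay 4-bit-quantized under QLoRA), the adapter adds only `r * (in + out)` parameters per matrix.
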
## 📂 Repository Structure

```plaintext
MK-LLM/
├── data/
│   ├── wikipedia/
│   │   ├── download_wiki.py
│   │   └── parse_wiki.py
│   ├── cleaned/
│   ├── processed/
│   ├── raw/
│   ├── tokenized/
│   ├── eval/
│   │   └── mk_eval.jsonl
│   ├── process_all_data.py
│   └── clean_wikipedia.py
├── examples/
│   ├── client_python.py
│   ├── client_js.mjs
│   ├── data_loader.py
│   └── train_mistral_mk.py
├── inference/
│   ├── api.py
│   ├── gradio_app.py
│   └── chatbot.py
├── training/
│   ├── train_pipeline.py
│   └── fine_tune_mistral.py
├── scripts/
│   ├── preprocess_data.py
│   └── evaluate.py
├── configs/
│   ├── train_small.yaml
│   └── train_full.yaml
├── tests/
│   ├── test_api.py
│   ├── test_model.py
│   └── test_dataset.py
├── docs/
│   ├── EXTENDING.md
│   └── GITHUB_ISSUES.md
├── .github/
│   ├── workflows/ci.yml
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.yml
│   │   └── feature_request.yml
│   └── PULL_REQUEST_TEMPLATE.md
├── models/
├── notebooks/
│   └── evaluation.ipynb
├── Dockerfile
├── docker-compose.yml
├── Makefile
├── requirements.txt
├── constraints.txt
├── LICENSE
├── MODEL_CARD.md
├── DATASET_CARD.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
└── README.md
```

## Getting Started

1) Clone the repository:

```bash
git clone https://github.com/AI-now-mk/MK-LLM.git
cd MK-LLM
```

2) Install dependencies:

```bash
pip install -r requirements.txt
```

Optional (recommended): use a virtual environment:

```bash
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
```

3) Configure the environment (optional)

Create a `.env` file in the project root:

```bash
HOST=0.0.0.0
PORT=8000
ALLOW_ORIGINS=*
MODEL_PATH=./models/mistral-finetuned-mk
MODEL_ID=mk-llm
TRUST_REMOTE_CODE=true
LOAD_IN_4BIT=false
LOAD_IN_8BIT=false
TORCH_DTYPE=float16
```

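The server reads these values through its env-based config. If you want the same `.env` values in your own scripts, a minimal stdlib loader might look like this (a hypothetical helper for illustration; in practice you may prefer the `python-dotenv` package):

```python
# Minimal .env loader sketch (hypothetical helper; python-dotenv does this
# more robustly). Lines are KEY=VALUE; '#' starts a comment line.
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines into os.environ, skipping comments and blanks."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: values already exported in the shell win
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
print(os.environ.get("PORT", "8000"))
```

Note the `setdefault`: variables already exported in the shell take precedence over the file, which matches the usual 12-factor convention.
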
4) Quick run: inference API

```bash
# Ensure a model exists at ./models/mistral-finetuned-mk (train or download)
python -m inference.api

# In another terminal, call the API
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Здраво Македонија!", "max_new_tokens":128}'
```

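The same `/generate` call can be made from Python using only the standard library. A sketch, assuming the server above is running on port 8000 (the `build_request` helper is ours for illustration, not part of the repo; see `examples/client_python.py` for the project's own client):

```python
# Stdlib-only client sketch for the /generate endpoint shown above.
# build_request is a hypothetical helper, not part of the MK-LLM repo.
import json
import urllib.request

def build_request(prompt, max_new_tokens=128, base_url="http://localhost:8000"):
    """Assemble the POST request for /generate without sending it."""
    body = json.dumps({"prompt": prompt, "max_new_tokens": max_new_tokens}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_request("Здраво Македонија!")
    with urllib.request.urlopen(req) as resp:  # requires the server to be running
        print(json.loads(resp.read()))
```
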
5) Optional: Gradio demo UI

```bash
python -m inference.gradio_app
# Open http://localhost:7860
```

6) Prepare data (Macedonian)

```bash
# Download and extract the Macedonian Wikipedia dump
python -m data.wikipedia.download_wiki

# Parse the Wikipedia dump into clean text
python -m data.wikipedia.parse_wiki

# Collect web data, then combine, clean, and language-filter for Macedonian
python -m data.process_all_data
```

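The pipeline's MinHash deduplication step (mentioned in the updates above) can be sketched in pure Python: shingle each document, keep the minimum hash per seed, and treat documents whose signatures agree in many positions as near-duplicates. This is an illustrative toy, not the repo's actual implementation, which likely relies on a library such as `datasketch`:

```python
# Toy MinHash near-duplicate detector (illustrative; not the repo's pipeline).
import hashlib

def shingles(text, n=3):
    """Character n-grams of the lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def minhash_signature(text, num_hashes=64):
    """For each seed, keep the minimum hash over all shingles."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int.from_bytes(hashlib.md5(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingles(text)
        ))
    return sig

def similarity(a, b):
    """Fraction of matching signature positions (estimates Jaccard similarity)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

doc1 = "Охрид е град во Македонија со богата историја."
doc2 = "Охрид е град во Македонија со многу богата историја."
doc3 = "Скопје е главен град."
s1, s2, s3 = (minhash_signature(d) for d in (doc1, doc2, doc3))
print(similarity(s1, s2) > similarity(s1, s3))  # near-duplicates score higher
```

A production pipeline would bucket signatures with locality-sensitive hashing instead of comparing every pair, so dedup stays near-linear in corpus size.
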
7) Train (example)

```bash
python -m training.train_pipeline
# or
python -m training.fine_tune_mistral
```

### Docker

Build and run the API with Docker:

```bash
docker build -t mk-llm .
docker run --gpus all -p 8000:8000 -e MODEL_PATH=./models/mistral-finetuned-mk mk-llm
```

Or via docker-compose:

```bash
docker-compose up --build
```

### Continuous Integration

This repository includes a GitHub Actions workflow that lints, type-checks, and runs the test suite on pull requests and commits to `main`.

### Constraints (reproducible installs)

To install with pinned versions:

```bash
pip install -r requirements.txt -c constraints.txt
```

### OpenAI-compatible endpoints

The server exposes OpenAI-style routes so that common clients (including gpt-oss-compatible tooling) can connect.

- Chat Completions (streaming supported):

```bash
curl http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mk-llm",
    "messages": [
      {"role": "system", "content": "Ти си помошник кој зборува на македонски."},
      {"role": "user", "content": "Која е историјата на Охрид?"}
    ],
    "stream": false
  }'
```

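With `"stream": true`, the server emits Server-Sent Events, each `data:` line carrying one JSON chunk (terminated by `data: [DONE]` in the OpenAI convention). A minimal stdlib parser for such a stream might look like this; it assumes the server's chunks follow the standard Chat Completions delta format, which is worth verifying against the actual output:

```python
# Sketch: extract assistant text from an OpenAI-style SSE stream.
# Assumes chunks follow the Chat Completions delta format.
import json

def collect_stream(lines):
    """Concatenate content deltas from 'data: {...}' SSE lines."""
    text = []
    for raw in lines:
        raw = raw.strip()
        if not raw.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = raw[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        text.append(delta.get("content", ""))  # first delta may carry only the role
    return "".join(text)

sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Здраво"}}]}',
    'data: {"choices": [{"delta": {"content": "!"}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # Здраво!
```
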
- Text Completions:

```bash
curl http://localhost:8000/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Здраво Македонија!",
    "max_tokens": 128
  }'
```

Related project: OpenAI gpt-oss (open-weight models, client compatibility notes). See `https://github.com/openai/gpt-oss`.

### Use with gpt-oss-compatible clients

Point any OpenAI-compatible client at this server.

Example (Python OpenAI SDK environment):

```bash
export OPENAI_API_KEY=dummy
export OPENAI_BASE_URL=http://localhost:8000/v1
```

Example Chat Completions (curl):

```bash
curl "$OPENAI_BASE_URL/chat/completions" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "mk-llm",
    "messages": [
      {"role": "system", "content": "Ти си помошник кој зборува на македонски."},
      {"role": "user", "content": "Која е историјата на Охрид?"}
    ],
    "stream": true
  }'
```