---
language:
- mk
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- macedonian
- cyrillic
- mistral
- qlora
- peft
base_model: mistralai/Mistral-7B-v0.1
datasets:
- ainowmk/MK-LLM-Mistral-data
metrics:
- perplexity
pretty_name: MK-LLM (Mistral)
model-index:
- name: MK-LLM-Mistral
results: []
---
# 🇲🇰 MK-LLM: The First Open Macedonian Language Model
## 🌍 About This Project
MK-LLM is Macedonia's first open-source Large Language Model (LLM), developed for the community, by the community. This project is led by AI Now - Association for Artificial Intelligence in Macedonia.
📌 **Website:** [www.ainow.mk](https://www.ainow.mk)
📩 **Contact:** [contact@ainow.mk](mailto:contact@ainow.mk)
🛠 **Model:** [MK-LLM-Mistral](https://huggingface.co/ainowmk/MK-LLM-Mistral)
💻 **GitHub:** [MK-LLM](https://github.com/AI-now-mk/MK-LLM)
## 🆕 Latest Updates (14.10.2025)
- OpenAI-compatible endpoints: `/v1/chat/completions`, `/v1/completions`, `/v1/models` with JSON SSE streaming
- QLoRA training pipeline (4-bit) with LoRA adapters and gradient checkpointing
- Upgraded Macedonian data pipeline: cleaner extraction (trafilatura), gcld3 language filter, MinHash dedup
- Gradio demo UI and improved FastAPI server (env-based config, lazy model load, quantization toggles)
- Repository hygiene: LICENSE, model/dataset cards, Makefile, package inits, `.gitkeep` for data/models
## 📂 Repository Structure
```plaintext
MK-LLM/
├── data/
│   ├── wikipedia/
│   │   ├── download_wiki.py
│   │   └── parse_wiki.py
│   ├── cleaned/
│   ├── processed/
│   ├── raw/
│   ├── tokenized/
│   ├── eval/
│   │   └── mk_eval.jsonl
│   ├── process_all_data.py
│   └── clean_wikipedia.py
├── examples/
│   ├── client_python.py
│   ├── client_js.mjs
│   ├── data_loader.py
│   └── train_mistral_mk.py
├── inference/
│   ├── api.py
│   ├── gradio_app.py
│   └── chatbot.py
├── training/
│   ├── train_pipeline.py
│   └── fine_tune_mistral.py
├── scripts/
│   ├── preprocess_data.py
│   └── evaluate.py
├── configs/
│   ├── train_small.yaml
│   └── train_full.yaml
├── tests/
│   ├── test_api.py
│   ├── test_model.py
│   └── test_dataset.py
├── docs/
│   ├── EXTENDING.md
│   └── GITHUB_ISSUES.md
├── .github/
│   ├── workflows/ci.yml
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.yml
│   │   └── feature_request.yml
│   └── PULL_REQUEST_TEMPLATE.md
├── models/
├── notebooks/
│   └── evaluation.ipynb
├── Dockerfile
├── docker-compose.yml
├── Makefile
├── requirements.txt
├── constraints.txt
├── LICENSE
├── MODEL_CARD.md
├── DATASET_CARD.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
└── README.md
```
## Getting Started
1) Clone the repository
```bash
git clone https://github.com/AI-now-mk/MK-LLM.git
cd MK-LLM
```
2) Install dependencies
```bash
pip install -r requirements.txt
```
Optional (recommended): use a virtual environment
```bash
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
```
3) Configure environment (optional)
Create a `.env` file in the project root:
```bash
HOST=0.0.0.0
PORT=8000
ALLOW_ORIGINS=*
MODEL_PATH=./models/mistral-finetuned-mk
MODEL_ID=mk-llm
TRUST_REMOTE_CODE=true
LOAD_IN_4BIT=false
LOAD_IN_8BIT=false
TORCH_DTYPE=float16
```
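If you prefer to read these settings programmatically, a minimal sketch with only the standard library might look like the following; `load_server_config` and `as_bool` are hypothetical helper names (the actual loading logic lives in `inference/api.py` and may differ), but the variable names and defaults match the `.env` example above:

```python
import os

def load_server_config() -> dict:
    """Read server settings from the environment, falling back to the defaults above."""
    def as_bool(name: str, default: str = "false") -> bool:
        # Accept common truthy spellings so `.env` values like "true" or "1" both work.
        return os.getenv(name, default).strip().lower() in {"1", "true", "yes"}

    return {
        "host": os.getenv("HOST", "0.0.0.0"),
        "port": int(os.getenv("PORT", "8000")),
        "model_path": os.getenv("MODEL_PATH", "./models/mistral-finetuned-mk"),
        "model_id": os.getenv("MODEL_ID", "mk-llm"),
        "trust_remote_code": as_bool("TRUST_REMOTE_CODE"),
        "load_in_4bit": as_bool("LOAD_IN_4BIT"),
        "load_in_8bit": as_bool("LOAD_IN_8BIT"),
        "torch_dtype": os.getenv("TORCH_DTYPE", "float16"),
    }
```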
4) Quick run: inference API
```bash
# Ensure a model exists at ./models/mistral-finetuned-mk (train or download)
python -m inference.api
# In another terminal, call the API
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Здраво Македонија!", "max_new_tokens":128}'
```
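The same call can be made from Python with only the standard library. A sketch, assuming the request body shown in the `curl` example above; `build_generate_request` and `generate` are hypothetical helper names, and the response schema depends on the server:

```python
import json
import urllib.request

def build_generate_request(prompt: str, max_new_tokens: int = 128) -> dict:
    """Build the JSON body expected by the /generate endpoint shown above."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}

def generate(prompt: str, base_url: str = "http://localhost:8000", **kwargs) -> dict:
    """POST the prompt to a running API server and return the parsed JSON reply."""
    body = json.dumps(build_generate_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```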
5) Optional: Gradio demo UI
```bash
python -m inference.gradio_app
# Open http://localhost:7860
```
6) Prepare data (Macedonian)
```bash
# Download and extract Macedonian Wikipedia
python -m data.wikipedia.download_wiki
# Parse Wikipedia dump into clean text
python -m data.wikipedia.parse_wiki
# Collect web data, combine and clean it, and apply the Macedonian language filter
python -m data.process_all_data
```
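The MinHash deduplication mentioned in the updates above can be illustrated with a small self-contained sketch (standard library only; the actual pipeline in `data/process_all_data.py` may use a dedicated library and different parameters). Two texts whose estimated Jaccard similarity exceeds a threshold are treated as near-duplicates:

```python
import hashlib

def shingles(text: str, k: int = 3) -> set:
    """Character k-shingles of a whitespace-normalized, lowercased text."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}

def minhash_signature(text: str, num_hashes: int = 64) -> list:
    """One minimum per salted hash function; a compact fingerprint of the shingle set."""
    sh = shingles(text)
    sig = []
    for seed in range(num_hashes):
        salt = str(seed).encode()
        sig.append(min(
            int.from_bytes(hashlib.md5(salt + s.encode("utf-8")).digest()[:8], "big")
            for s in sh
        ))
    return sig

def estimated_jaccard(a: str, b: str) -> float:
    """Fraction of matching signature slots approximates Jaccard similarity."""
    sa, sb = minhash_signature(a), minhash_signature(b)
    return sum(x == y for x, y in zip(sa, sb)) / len(sa)
```

Near-duplicate documents agree on most signature slots, so comparing short fixed-size signatures avoids comparing full texts pairwise.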
7) Train (example)
```bash
python -m training.train_pipeline
# or
python -m training.fine_tune_mistral
```
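For orientation, here is a minimal configuration sketch of the kind of QLoRA setup the training pipeline uses (a 4-bit quantized base model with trainable LoRA adapters and gradient checkpointing, per the updates above). The hyperparameters and target modules are illustrative assumptions; the authoritative values are in `training/fine_tune_mistral.py`:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization for the frozen base weights (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
model.gradient_checkpointing_enable()  # trade recomputation for memory

# Small trainable LoRA adapters on the attention projections (assumed targets).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

With this setup only the adapter weights receive gradients, which is what makes fine-tuning a 7B model feasible on a single consumer GPU.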
### Docker
Build and run the API with Docker:
```bash
docker build -t mk-llm .
docker run --gpus all -p 8000:8000 -e MODEL_PATH=./models/mistral-finetuned-mk mk-llm
```
Or via docker-compose:
```bash
docker-compose up --build
```
### Continuous Integration
This repository includes a GitHub Actions CI workflow that lints, type-checks, and runs tests on pull requests and commits to `main`.
### Constraints (reproducible installs)
To install with pinned versions:
```bash
pip install -r requirements.txt -c constraints.txt
```
### OpenAI-compatible endpoints
The server exposes OpenAI-style routes, so common OpenAI-compatible clients (including gpt-oss tooling) can connect without modification.
- Chat Completions (streaming supported):
```bash
curl http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mk-llm",
    "messages": [
      {"role": "system", "content": "Ти си помошник кој зборува на македонски."},
      {"role": "user", "content": "Која е историјата на Охрид?"}
    ],
    "stream": false
  }'
```
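When `"stream": true` is set, the server sends SSE `data:` lines, each carrying a JSON chunk, terminated by `data: [DONE]` in the usual OpenAI convention. A small standard-library sketch for reassembling the assistant's text, assuming the chunks follow the OpenAI `delta` format:

```python
import json

def collect_stream(sse_lines) -> str:
    """Accumulate assistant text from OpenAI-style chat-completion SSE lines."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)
```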
- Text Completions:
```bash
curl http://localhost:8000/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Здраво Македонија!",
    "max_tokens": 128
  }'
```
Related project: [openai/gpt-oss](https://github.com/openai/gpt-oss) (open-weight models, client compatibility notes).
### Use with gpt-oss-compatible clients
Point any OpenAI-compatible client to this server.
Example (Python OpenAI SDK environment):
```bash
export OPENAI_API_KEY=dummy
export OPENAI_BASE_URL=http://localhost:8000/v1
```
Example Chat Completions (curl):
```bash
curl "$OPENAI_BASE_URL/chat/completions" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "mk-llm",
    "messages": [
      {"role": "system", "content": "Ти си помошник кој зборува на македонски."},
      {"role": "user", "content": "Која е историјата на Охрид?"}
    ],
    "stream": true
  }'
```