iski-lora

LoRA adapter trained on İSKİ (İstanbul Su ve Kanalizasyon İdaresi) corporate Q&A data for domain-specific Turkish language understanding.

Model Information

Field                 Value
Model type            PEFT LoRA Adapter
Base model            unsloth/Qwen3.6-35B-A3B
Framework             Transformers + PEFT (Unsloth)
Precision             bfloat16
LoRA rank / alpha     r=32, alpha=64
LoRA target modules   q_proj, k_proj, v_proj, o_proj
LoRA dropout          0
Training samples      ~7,100 İSKİ Q&A pairs
Language              Turkish
Domain                Municipal water & sewerage administration
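As a rough illustration of what the LoRA settings above imply, the sketch below collects them into a keyword dictionary and computes the standard LoRA scaling factor alpha / r. The dictionary layout and the `task_type` entry are assumptions for illustration; the card does not show the actual training script.

```python
# Values copied from the Model Information table above.
# The dict layout itself is an illustrative assumption.
lora_kwargs = {
    "r": 32,
    "lora_alpha": 64,
    "lora_dropout": 0.0,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling: the low-rank update B @ A is multiplied by alpha / r."""
    return lora_alpha / r


scaling = lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"])
print(scaling)  # 2.0
```

With plain PEFT, a dictionary like this could be passed as `LoraConfig(**lora_kwargs)`; a training run through Unsloth may configure the adapter differently.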

Training Configuration

Parameter                 Value
Epochs                    3
Learning rate             5e-5
LR scheduler              cosine
Optimizer                 paged_adamw_8bit
Batch size (per device)   8
Gradient accumulation     4 (effective batch: 32)
Weight decay              0.01
Warmup ratio              0.03
Max sequence length       4096
Early stopping patience   2
Best model metric         eval_loss
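To make the schedule concrete, here is a minimal sketch that derives the approximate step counts from the table and evaluates a linear-warmup plus cosine-decay learning rate. It assumes a single device, a train split of roughly 95% of ~7,100 samples, and that early stopping does not cut the run short; the actual numbers depend on the real split and trainer version.

```python
import math


def schedule_summary(n_train, per_device_batch=8, grad_accum=4,
                     epochs=3, warmup_ratio=0.03, n_devices=1):
    """Approximate optimizer-step counts for the configuration in the table."""
    effective_batch = per_device_batch * grad_accum * n_devices
    steps_per_epoch = math.ceil(n_train / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, total_steps, warmup_steps


def cosine_lr(step, total_steps, warmup_steps, peak_lr=5e-5):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: about 95% of ~7,100 samples end up in the train split.
n_train = 7100 * 95 // 100
effective, total, warmup = schedule_summary(n_train)
print(effective, total, warmup)  # 32 633 19
print(cosine_lr(warmup, total, warmup))  # peak learning rate right after warmup
```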

Data Pipeline

The dataset is split with GroupShuffleSplit (group key: source_file) at a 95/5 train/eval ratio, so that all samples derived from the same source file land in the same split. This prevents leakage between train and eval.
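The idea behind a group-wise split can be sketched without any dependencies. The function below is a stdlib-only stand-in that mimics the behaviour, not scikit-learn's implementation; in scikit-learn the corresponding call would be `GroupShuffleSplit(n_splits=1, test_size=0.05, random_state=...)`. The sample layout, the seed, and the synthetic file names are assumptions. As in scikit-learn, the eval fraction applies to groups, not to individual samples.

```python
import random
from collections import defaultdict


def group_split(samples, group_key="source_file", eval_fraction=0.05, seed=42):
    """Split samples so that every group ends up entirely in train or in eval."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[group_key]].append(sample)

    names = sorted(groups)                 # deterministic order before shuffling
    random.Random(seed).shuffle(names)
    n_eval = max(1, round(len(names) * eval_fraction))
    eval_names = set(names[:n_eval])

    train = [s for s in samples if s[group_key] not in eval_names]
    held_out = [s for s in samples if s[group_key] in eval_names]
    return train, held_out


# Synthetic data: 40 hypothetical source files with 5 Q&A pairs each.
samples = [{"source_file": f"doc_{i:02d}.pdf", "instruction": f"q{i}-{j}"}
           for i in range(40) for j in range(5)]
train_set, eval_set = group_split(samples)
print(len(train_set), len(eval_set))
```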

Each training sample is formatted as a three-message conversation using the Qwen3 chat template:

System    → domain context (category, source_file, source_year)
User      → instruction [+ optional input]
Assistant → expected output
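A minimal sketch of this mapping is shown below. The dictionary keys (`instruction`, `input`, `output`, `category`, `source_file`, `source_year`) and the blank-line separator between instruction and input are assumptions; the system prompt wording follows the one used in the Inference section of this card, and whether training used exactly this string is also an assumption.

```python
SYSTEM_BASE = "Sen kurum ici dokumanlar konusunda uzman, yardimci bir asistansin."


def to_messages(sample):
    """Turn one Q&A record into system / user / assistant chat messages."""
    category = sample.get("category", "")
    source_file = sample.get("source_file", "")
    source_year = sample.get("source_year", "")
    system = (f"{SYSTEM_BASE} Baglam Bilgisi - Kategori: {category}, "
              f"Kaynak: {source_file} ({source_year}).")

    user = sample["instruction"]
    if sample.get("input"):
        # Assumption: the optional input is appended after a blank line.
        user = f"{user}\n\n{sample['input']}"

    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": sample["output"]},
    ]


example = to_messages({
    "instruction": "A", "input": "B", "output": "C",
    "category": "k", "source_file": "f.pdf", "source_year": "2023",
})
print([m["role"] for m in example])
```

The resulting list can then be rendered with `tokenizer.apply_chat_template(messages, tokenize=False)`.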

Loss is computed only on assistant responses (train_on_responses_only).
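Conceptually, response-only training replaces the labels of all non-assistant tokens with the ignore index -100, which PyTorch's cross-entropy loss skips by default. The sketch below shows only that masking step on toy token ids; it is not Unsloth's `train_on_responses_only` implementation, and the `assistant_spans` argument is a hypothetical input standing in for however the response boundaries are located.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_non_assistant(input_ids, assistant_spans):
    """Return labels that keep only tokens inside the given [start, end) spans."""
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:
        labels[start:end] = input_ids[start:end]
    return labels


# Toy example: tokens 0-4 are system + user, tokens 5-7 are the assistant reply.
input_ids = [101, 102, 103, 104, 105, 201, 202, 203]
labels = mask_non_assistant(input_ids, assistant_spans=[(5, 8)])
print(labels)  # [-100, -100, -100, -100, -100, 201, 202, 203]
```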

Usage

Load the Adapter

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_id = "unsloth/Qwen3.6-35B-A3B"
adapter_id = "iskiai/iski-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

Inference

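# Turkish system prompt; roughly: "You are a helpful assistant specialized in
# internal corporate documents." Keep the wording unchanged when prompting the adapter.
SYSTEM_BASE = "Sen kurum ici dokumanlar konusunda uzman, yardimci bir asistansin."

def build_system_message(category="", source_file="", source_year=""):
    # "Baglam Bilgisi - Kategori / Kaynak" = "Context information - Category / Source"
    return (f"{SYSTEM_BASE} Baglam Bilgisi - Kategori: {category}, "
            f"Kaynak: {source_file} ({source_year}).")

messages = [
    {"role": "system", "content": build_system_message()},
    # "How do I open an İSKİ water subscription?"
    {"role": "user",   "content": "İSKİ su aboneliği nasıl açılır?"},
]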

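prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    # Qwen3: thinking mode must be disabled. With Transformers the flag is passed
    # directly; chat_template_kwargs is the vLLM request field, not a tokenizer argument.
    enable_thinking=False,
)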

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        use_cache=True,
    )

response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True
)
print(response)

⚠️ Thinking Mode: Qwen3 models enable chain-of-thought thinking mode by default. If enable_thinking is not set to false in production, responses may contain a <think>...</think> block.
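As a defensive fallback, a caller can strip any thinking block that still slips through. The helper below is a hypothetical post-processing sketch, not part of the adapter or of any library; it assumes the block is delimited by literal `<think>` and `</think>` tags as described above.

```python
import re

# Non-greedy match across newlines, plus any whitespace trailing the block.
_THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove <think>...</think> blocks from a generated response."""
    return _THINK_RE.sub("", text).strip()


raw = "<think>\nplan the answer\n</think>\n\nMerhaba"
print(strip_thinking(raw))  # Merhaba
```

Disabling thinking at request time remains the primary fix; this only guards against misconfigured clients.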

Serving with vLLM

vllm serve unsloth/Qwen3.6-35B-A3B \
  --enable-lora \
  --lora-modules iski-lora=iskiai/iski-lora \
  --max-lora-rank 32 \
  --max-model-len 8192 \
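  --port 5001

Query the server through its OpenAI-compatible API (build_system_message is the helper defined in the Inference section):

from openai import OpenAI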

client = OpenAI(base_url="http://localhost:5001/v1", api_key="dummy")

response = client.chat.completions.create(
    model="iski-lora",
    messages=[
        {"role": "system", "content": build_system_message()},
        # "How do I report a water meter fault?"
        {"role": "user",   "content": "Su sayacı arızası nasıl bildirilir?"},
    ],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}}
)
print(response.choices[0].message.content)

Files

iski-lora/
├── adapter_model.safetensors   # LoRA weights
├── adapter_config.json         # PEFT configuration
├── tokenizer_config.json
└── README.md

Training Outputs

The training script produces the following artifacts alongside the adapter:

Artifact                   Description
egitim_loglari.jsonl       Step-by-step training logs (loss, eval_loss, lr, epoch) in JSON Lines format
benchmark_sonuclari.json   Final metrics: perplexity, VRAM usage, sample generations
grafikler/                 Per-metric PNG plots + combined summary dashboard
runs/                      TensorBoard logs (local, no external connection required)

View training curves locally:

tensorboard --logdir runs
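The JSON Lines log can also be inspected directly. The sketch below picks the record with the lowest eval_loss, which is the metric used for best-model selection. The keys loss, eval_loss, lr, and epoch come from the artifact description; the `step` key and all numeric values in the demo lines are made up for illustration and are not results from this training run.

```python
import json


def best_eval_record(lines):
    """Return the log record with the lowest eval_loss, or None if there is none."""
    best = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "eval_loss" not in record:
            continue  # plain training-step record
        if best is None or record["eval_loss"] < best["eval_loss"]:
            best = record
    return best


# Made-up demo lines; real use would be:
#   with open("egitim_loglari.jsonl", encoding="utf-8") as f:
#       best = best_eval_record(f)
demo_lines = [
    '{"step": 50, "loss": 1.9, "lr": 5e-05, "epoch": 0.24}',
    '{"step": 100, "eval_loss": 1.42, "epoch": 0.47}',
    '{"step": 200, "eval_loss": 1.31, "epoch": 0.95}',
    '{"step": 300, "eval_loss": 1.35, "epoch": 1.42}',
]
best = best_eval_record(demo_lines)
print(best)
```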

Evaluation

Final model quality is assessed by:

  1. Perplexity on the held-out eval split (computed from eval_loss)
  2. Sample generations — 5 randomly drawn eval examples are generated greedily and saved to benchmark_sonuclari.json for manual inspection
  3. DeepEval GEval — downstream evaluation using a vLLM-backed judge model
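For the first item, perplexity follows from the evaluation loss as exp(eval_loss). The sketch below assumes eval_loss is the mean per-token cross-entropy in nats, which is what the Transformers trainer reports for causal language modeling; the loss value in the demo is a placeholder, not a measured result.

```python
import math


def perplexity(eval_loss: float) -> float:
    """Perplexity from mean per-token cross-entropy (natural log)."""
    return math.exp(eval_loss)


placeholder_loss = 1.0  # placeholder value, not a result from this model
print(round(perplexity(placeholder_loss), 3))  # 2.718
```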

License

For internal İSKİ use only. Contact the model owner for licensing inquiries.

Author

Created by İSKİ AI Team · iskiai
