iski-lora
LoRA adapter trained on İSKİ (İstanbul Su ve Kanalizasyon İdaresi) corporate Q&A data for domain-specific Turkish language understanding.
Model Information
| Field | Value |
|---|---|
| Model type | PEFT LoRA Adapter |
| Base model | unsloth/Qwen3.6-35B-A3B |
| Framework | Transformers + PEFT (Unsloth) |
| Precision | bfloat16 |
| LoRA rank / alpha | r=32, alpha=64 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj |
| LoRA dropout | 0 |
| Training samples | ~7,100 İSKİ Q&A pairs |
| Language | Turkish |
| Domain | Municipal water & sewerage administration |
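For reference, the adapter settings in the table map onto the field names PEFT writes to adapter_config.json. The dict below is a sketch assembled from the table, not a copy of the shipped file. The task_type value is an assumption, and the scaling helper assumes standard LoRA scaling (alpha / r) rather than rank-stabilized LoRA.

```python
# Sketch of the adapter hyperparameters from the table above, using the
# field names PEFT writes to adapter_config.json. Values come from the table;
# "task_type" is an assumption (the base model is loaded as a causal LM).
lora_settings = {
    "peft_type": "LORA",
    "task_type": "CAUSAL_LM",  # assumption, not stated in the table
    "base_model_name_or_path": "unsloth/Qwen3.6-35B-A3B",
    "r": 32,
    "lora_alpha": 64,
    "lora_dropout": 0.0,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}


def lora_scaling(settings: dict) -> float:
    """Standard LoRA scales the low-rank update by lora_alpha / r."""
    return settings["lora_alpha"] / settings["r"]


print(lora_scaling(lora_settings))  # 2.0
```

With r=32 and alpha=64 the low-rank update is scaled by 2, and only the four attention projections carry adapter weights.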
Training Configuration
| Parameter | Value |
|---|---|
| Epochs | 3 |
| Learning rate | 5e-5 |
| LR scheduler | cosine |
| Optimizer | paged_adamw_8bit |
| Batch size (per device) | 8 |
| Gradient accumulation | 4 (effective batch: 32) |
| Weight decay | 0.01 |
| Warmup ratio | 0.03 |
| Max sequence length | 4096 |
| Early stopping patience | 2 |
| Best model metric | eval_loss |
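To make the schedule concrete, the sketch below derives the effective batch size, an approximate step count, and the learning rate at a given step from the values in the table. It assumes a single device and roughly 95% of the ~7,100 samples in the training split. The exact step count reported by the trainer may differ, and early stopping can end the run before the last step.

```python
import math

# Values from the tables above. The train-set size is an estimate: the 95/5
# split is group-wise, so the training share is only approximately 95%.
TRAIN_SAMPLES = round(7100 * 0.95)  # assumption: about 6,745 samples
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 4
NUM_DEVICES = 1                     # assumption: single-GPU run
EPOCHS = 3
BASE_LR = 5e-5
WARMUP_RATIO = 0.03

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
print(lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the run has about 633 optimizer steps, of which roughly the first 19 are warmup.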
Data Pipeline
The dataset is split with GroupShuffleSplit (group key: source_file) at a 95/5 train/eval ratio. All samples from the same source file land in the same split, which prevents leakage between train and eval across documents from the same source.
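The idea behind the group-wise split can be illustrated without any dependencies. The function below is a simplified stand-in for scikit-learn's GroupShuffleSplit, not the training script's actual code: it shuffles whole groups and assigns a fraction of them to eval. The toy records and file names are hypothetical.

```python
import random
from collections import defaultdict


def group_split(records, group_key="source_file", eval_fraction=0.05, seed=42):
    """Split records so every group lands entirely in train or in eval.

    Dependency-free stand-in for scikit-learn's GroupShuffleSplit: whole
    groups are shuffled and about `eval_fraction` of them go to eval.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    names = sorted(groups)  # deterministic order before shuffling
    random.Random(seed).shuffle(names)

    n_eval = max(1, round(len(names) * eval_fraction))
    eval_ = [r for name in names[:n_eval] for r in groups[name]]
    train = [r for name in names[n_eval:] for r in groups[name]]
    return train, eval_


# Toy data: 40 hypothetical source files with 3 Q&A pairs each.
records = [
    {"source_file": f"doc_{i:02d}.pdf", "instruction": f"q{i}-{j}"}
    for i in range(40)
    for j in range(3)
]
train, eval_ = group_split(records)
train_files = {r["source_file"] for r in train}
eval_files = {r["source_file"] for r in eval_}
print(len(train), len(eval_), train_files & eval_files)
```

Because the fraction applies to groups rather than rows, the row-level ratio is only approximately 95/5 when source files differ in size.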
Each training sample is formatted as a three-turn conversation using the Qwen3 chat template:
System → domain context (category, source_file, source_year)
User → instruction [+ optional input]
Assistant → expected output
Loss is computed only on assistant responses (train_on_responses_only).
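A minimal sketch of the formatting and masking steps described above. The record keys (instruction, input, output, category, source_file, source_year) and the separator between instruction and input are assumptions based on the description, and mask_prompt_tokens is a hypothetical helper that shows the idea behind train_on_responses_only rather than Unsloth's implementation.

```python
SYSTEM_BASE = "Sen kurum ici dokumanlar konusunda uzman, yardimci bir asistansin."
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips


def build_messages(record: dict) -> list:
    """Turn one Q&A record into system/user/assistant turns.

    The record keys are assumed names, not confirmed by the training script.
    """
    system = (
        f"{SYSTEM_BASE} Baglam Bilgisi - Kategori: {record.get('category', '')}, "
        f"Kaynak: {record.get('source_file', '')} ({record.get('source_year', '')})."
    )
    user = record["instruction"]
    if record.get("input"):
        user = f"{user}\n\n{record['input']}"  # separator is an assumption
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": record["output"]},
    ]


def mask_prompt_tokens(input_ids: list, response_start: int) -> list:
    """Copy input_ids to labels, ignoring everything before the response."""
    return [
        token if position >= response_start else IGNORE_INDEX
        for position, token in enumerate(input_ids)
    ]


example = {"instruction": "Soru", "output": "Yanıt", "category": "genel"}
messages = build_messages(example)
labels = mask_prompt_tokens([11, 12, 13, 14, 15], response_start=3)
print([m["role"] for m in messages])
print(labels)  # [-100, -100, -100, 14, 15]
```

Only the positions at or after response_start contribute to the loss, so the model is not trained to reproduce the system or user turns.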
Usage
Load the Adapter
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
base_model_id = "unsloth/Qwen3.6-35B-A3B"
adapter_id = "iskiai/iski-lora"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
Inference
SYSTEM_BASE = "Sen kurum ici dokumanlar konusunda uzman, yardimci bir asistansin."
def build_system_message(category="", source_file="", source_year=""):
    return (f"{SYSTEM_BASE} Baglam Bilgisi - Kategori: {category}, "
            f"Kaynak: {source_file} ({source_year}).")
messages = [
    {"role": "system", "content": build_system_message()},
    # "How do I open an İSKİ water subscription?"
    {"role": "user", "content": "İSKİ su aboneliği nasıl açılır?"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    # Qwen3: thinking mode must be disabled. Extra keyword arguments are
    # forwarded to the chat template, so the flag is passed directly.
    enable_thinking=False,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        use_cache=True,
    )
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)
⚠️ Thinking Mode: Qwen3 models enable chain-of-thought thinking mode by default. If enable_thinking is not set to False in production, responses may contain a <think>...</think> block.
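As a defensive measure, a serving layer can also strip any thinking block that slips through. The helper below is a hypothetical post-processing sketch, not part of the adapter or of any library. It only removes complete <think>...</think> blocks, so a block cut off by max_new_tokens would be left in place.

```python
import re

# Matches a complete <think>...</think> block plus trailing whitespace.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove complete <think>...</think> blocks from a generated response."""
    return _THINK_BLOCK.sub("", text).strip()


print(strip_thinking("<think>ara adımlar</think>\n\nYanıt metni."))  # Yanıt metni.
```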
Serving with vLLM
vllm serve unsloth/Qwen3.6-35B-A3B \
  --enable-lora \
  --lora-modules iski-lora=iskiai/iski-lora \
  --max-lora-rank 32 \
  --max-model-len 8192 \
  --port 5001
from openai import OpenAI

# build_system_message() is the helper defined in the Inference section
client = OpenAI(base_url="http://localhost:5001/v1", api_key="dummy")
response = client.chat.completions.create(
    model="iski-lora",
    messages=[
        {"role": "system", "content": build_system_message()},
        # "How do I report a water meter fault?"
        {"role": "user", "content": "Su sayacı arızası nasıl bildirilir?"},
    ],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)
Files
iski-lora/
├── adapter_model.safetensors # LoRA weights
├── adapter_config.json # PEFT configuration
├── tokenizer_config.json
└── README.md
Training Outputs
The training script produces the following artifacts alongside the adapter:
| Artifact | Description |
|---|---|
| egitim_loglari.jsonl | Step-by-step training logs (loss, eval_loss, lr, epoch) in JSON Lines format |
| benchmark_sonuclari.json | Final metrics: perplexity, VRAM usage, sample generations |
| grafikler/ | Per-metric PNG plots + combined summary dashboard |
| runs/ | TensorBoard logs (local, no external connection required) |
View training curves locally:
tensorboard --logdir runs
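The JSON Lines log can also be inspected without TensorBoard. The sketch below finds the entry with the lowest eval_loss, which is the metric used to select the best model. The sample lines are synthetic values for illustration only, not real training results, and the step key is an assumption about the log schema.

```python
import json

# Synthetic sample lines for illustration only; the real file's keys and
# values may differ. "step" is an assumed key name.
SAMPLE_LOG = """\
{"step": 50, "epoch": 0.24, "loss": 1.92, "lr": 5e-05}
{"step": 211, "epoch": 1.0, "eval_loss": 1.41}
{"step": 422, "epoch": 2.0, "eval_loss": 1.33}
{"step": 633, "epoch": 3.0, "eval_loss": 1.36}
"""


def best_eval(lines):
    """Return (step, eval_loss) for the lowest eval_loss in JSONL lines."""
    best = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if "eval_loss" not in entry:
            continue  # training-only entries carry loss/lr but no eval_loss
        if best is None or entry["eval_loss"] < best[1]:
            best = (entry.get("step"), entry["eval_loss"])
    return best


print(best_eval(SAMPLE_LOG.splitlines()))  # (422, 1.33)
```

For the real log, pass an open file handle instead, for example best_eval(open("egitim_loglari.jsonl", encoding="utf-8")).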
Evaluation
Final model quality is assessed by:
- Perplexity on the held-out eval split (computed from eval_loss)
- Sample generations: 5 randomly drawn eval examples are generated greedily and saved to benchmark_sonuclari.json for manual inspection
- DeepEval GEval: downstream evaluation using a vLLM-backed judge model
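The perplexity figure in the first item follows directly from the loss. A minimal sketch, assuming eval_loss is the mean per-token cross-entropy in nats, which is how the Hugging Face Trainer reports it:

```python
import math


def perplexity(eval_loss: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(eval_loss)


print(perplexity(0.0))  # 1.0
```

A loss of 0 corresponds to a perplexity of 1 (perfect prediction), and every increase of 1.0 in eval_loss multiplies perplexity by e.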
License
For internal İSKİ use only. Contact the model owner for licensing inquiries.
Author
Created by İSKİ AI Team · iskiai