---
base_model: vngrs-ai/Kumru-2B-Base
library_name: peft
license: apache-2.0
tags:
- lora
- causal-lm
- mistral
- turkish
- kumru
datasets:
- vngrs-ai/vngrs-web-corpus
language:
- tr
pipeline_tag: text-generation
---

# Kumru-2B LoRA Adapter

This repository provides a **LoRA** adapter distilled from the **VNGRS Kumru-2B** model (`vngrs-ai/Kumru-2B`, the SFT/chat variant) to be applied on top of the base model `vngrs-ai/Kumru-2B-Base`. The goal is to transfer Kumru's chat/instruction behavior to `Kumru-2B-Base` deployments with a lightweight file footprint.

## Model Summary

- **Base model:** `vngrs-ai/Kumru-2B-Base`
- **Source (target behavior) model:** `vngrs-ai/Kumru-2B` (SFT/chat)
- **Technique:** Low-Rank Adaptation (LoRA)
- **LoRA rank / alpha:** 768 / 1024 _(update these if you produce a different build)_
- **Layer coverage:** all self-attention and MLP projections
- **Output artifacts:** PEFT-compatible `adapter_config.json` + `adapter_model.safetensors`
- **License:** Apache 2.0 (aligned with VNGRS Kumru model licensing)

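For context, PEFT applies a LoRA adapter at inference as `W' = W + (alpha / rank) · B · A`, so the rank/alpha pair above implies an effective scaling of 1024 / 768 ≈ 1.33. A minimal NumPy sketch of that update rule (toy dimensions, purely illustrative):

```python
import numpy as np

rank, alpha = 768, 1024
scale = alpha / rank  # effective LoRA scaling factor, ~1.33

# Toy dimensions standing in for a real projection layer.
out_dim, in_dim = 64, 48
rng = np.random.default_rng(0)
w = rng.standard_normal((out_dim, in_dim))  # frozen base weight
b = rng.standard_normal((out_dim, 8))       # low-rank factor B
a = rng.standard_normal((8, in_dim))        # low-rank factor A

# The adapted weight the model effectively uses at inference time.
w_adapted = w + scale * (b @ a)
```

Increasing alpha relative to rank amplifies the adapter's contribution to the merged weight.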
## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "vngrs-ai/Kumru-2B-Base"
adapter = "ceofast/kumru-2b-lora"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter, device_map="auto")

messages = [
    {"role": "system", "content": "Adın Kumru, Türkçe konuşan yardımcı bir modelsin."},
    {"role": "user", "content": "İstanbul'un fethi hakkında kısa bilgi verir misin?"}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

> Note: This adapter must be used together with `vngrs-ai/Kumru-2B-Base`.

## Extraction Process

The adapter is obtained by computing the delta between the base and the SFT checkpoints and factorizing it with **SVD** into low-rank components. In this release, the measured reconstruction error is approximately **0.409**. To better preserve quality, you may increase rank/alpha and export a new version (e.g., rank **1024** / alpha **2048**). A lower-error build will be added as soon as possible.

- Script: `export_kumru.py`

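The delta-factorization step described above can be sketched as follows. This is a simplified NumPy illustration of the idea, not the actual `export_kumru.py`:

```python
import numpy as np

def extract_lora(w_base, w_sft, rank):
    """Factor the SFT-vs-base weight delta into LoRA matrices so B @ A ~ delta."""
    delta = w_sft - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s            # shape: (out_dim, rank)
    a = sqrt_s[:, None] * vt[:rank, :]  # shape: (rank, in_dim)
    # Relative reconstruction error of the truncated factorization.
    err = np.linalg.norm(delta - b @ a) / np.linalg.norm(delta)
    return a, b, err
```

Raising the rank keeps more singular values and lowers the reconstruction error, which is why a higher-rank export (e.g., 1024) is expected to track the SFT behavior more faithfully.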
## Known Limitations

- Kumru-2B is still a ~2B-parameter model; it may struggle with very long context, rare technical terms, and complex math.
- With low ranks, SVD-based LoRA can be less stable than the original SFT checkpoint.
- Training data is based on VNGRS's public Turkish corpus cleaning pipeline; truthfulness/hallucination issues may still occur.

### Framework versions

- PEFT 0.11.1