---
base_model: vngrs-ai/Kumru-2B-Base
library_name: peft
license: apache-2.0
tags:
- lora
- causal-lm
- mistral
- turkish
- kumru
datasets:
- vngrs-ai/vngrs-web-corpus
language:
- tr
pipeline_tag: text-generation
---

# Kumru-2B LoRA Adapter

This repository provides a **LoRA** adapter extracted from the **VNGRS Kumru-2B** model (`vngrs-ai/Kumru-2B`, the SFT/chat variant), to be applied on top of the base model `vngrs-ai/Kumru-2B-Base`. The goal is to transfer Kumru's chat/instruction-following behavior to `Kumru-2B-Base` deployments with a lightweight file footprint.

## Model Summary

- **Base model:** `vngrs-ai/Kumru-2B-Base`
- **Source (target behavior) model:** `vngrs-ai/Kumru-2B` (SFT/chat)
- **Technique:** Low-Rank Adaptation (LoRA)
- **LoRA rank / alpha:** 768 / 1024 _(update these if you produce a different build)_
- **Layer coverage:** all self-attention and MLP projections
- **Output artifacts:** PEFT-compatible `adapter_config.json` + `adapter_model.safetensors`
- **License:** Apache 2.0 (aligned with the VNGRS Kumru model licensing)

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "vngrs-ai/Kumru-2B-Base"
adapter = "ceofast/kumru-2b-lora"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter, device_map="auto")

messages = [
    # System prompt: "Your name is Kumru, you are a helpful Turkish-speaking model."
    {"role": "system", "content": "Adın Kumru, Türkçe konuşan yardımcı bir modelsin."},
    # User prompt: "Can you give me brief information about the conquest of Istanbul?"
    {"role": "user", "content": "İstanbul'un fethi hakkında kısa bilgi verir misin?"}
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
# do_sample=True is required for temperature/top_p to take effect.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

> Note: This adapter must be used together with `vngrs-ai/Kumru-2B-Base`. If you prefer a single standalone checkpoint, see the optional merging section at the end of this card.

## Extraction Process

The adapter is obtained by computing the weight delta between the base and SFT checkpoints and factorizing it into low-rank components with a truncated **SVD**; a minimal sketch of this factorization is given in the appendix at the end of this card. In this release, the measured reconstruction error is approximately **0.409**. To preserve more of the SFT behavior, you can increase rank/alpha and export a new version (e.g., rank **1024** / alpha **2048**). A lower-error build will be added as soon as possible.

- Script: `export_kumru.py`

## Known Limitations

- Kumru-2B is still a ~2B-parameter model; it may struggle with very long contexts, rare technical terms, and complex math.
- At low ranks, SVD-extracted LoRA can be less faithful than the original SFT checkpoint.
- The training data is based on VNGRS's public Turkish corpus cleaning pipeline; truthfulness/hallucination issues may still occur.

### Framework versions

- PEFT 0.11.1
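
## Merging the Adapter (Optional)

If you would rather ship one standalone checkpoint instead of loading base + adapter at runtime, you can fold the LoRA weights into the base model with PEFT's `merge_and_unload`. This is a minimal sketch continuing from the Usage snippet above; the output directory name is illustrative.

```python
# `model` is the PeftModel built in the Usage snippet.
merged = model.merge_and_unload()             # folds the LoRA deltas into the base weights
merged.save_pretrained("kumru-2b-merged")     # illustrative output directory
tokenizer.save_pretrained("kumru-2b-merged")
```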
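
## Appendix: Extraction Sketch

The snippet below is a minimal sketch of the per-layer SVD factorization described in the Extraction Process section, not the actual `export_kumru.py` implementation. The function name, the even split of singular values between the two factors, the compensation for PEFT's `alpha / rank` scaling, and the reading of the reported error as a relative Frobenius norm are all illustrative assumptions.

```python
import torch

def extract_lora(w_base: torch.Tensor, w_sft: torch.Tensor,
                 rank: int = 768, alpha: int = 1024):
    """Factorize the SFT-minus-base weight delta into LoRA factors via truncated SVD."""
    delta = (w_sft - w_base).float()                # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]  # keep the top-`rank` components
    # Split the singular values evenly so that B @ A reproduces the truncated delta.
    B = U * S.sqrt()                                # lora_B: (out_features, rank)
    A = S.sqrt().unsqueeze(1) * Vh                  # lora_A: (rank, in_features)
    # PEFT scales the product by alpha / rank at load time, so compensate here.
    B = B * (rank / alpha)
    # Relative Frobenius reconstruction error (assumed metric for the ~0.409 figure).
    rel_err = (torch.linalg.norm(delta - (alpha / rank) * (B @ A))
               / torch.linalg.norm(delta)).item()
    return A, B, rel_err
```

With `rank >= min(out_features, in_features)` the factorization is exact; lower ranks trade file size against fidelity, which is why a rank 1024 / alpha 2048 build is suggested above as a lower-error alternative.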