---
base_model: vngrs-ai/Kumru-2B-Base
library_name: peft
license: apache-2.0
tags:
- lora
- causal-lm
- mistral
- turkish
- kumru
datasets:
- vngrs-ai/vngrs-web-corpus
language:
- tr
pipeline_tag: text-generation
---
# Kumru-2B LoRA Adapter
This repository provides a **LoRA** adapter extracted from the **VNGRS Kumru-2B** model
(`vngrs-ai/Kumru-2B`, the SFT/chat variant), meant to be applied on top of the base model
`vngrs-ai/Kumru-2B-Base`. The goal is to bring Kumru's chat/instruction behavior
to `Kumru-2B-Base` deployments with a lightweight file footprint.
## Model Summary
- **Base model:** `vngrs-ai/Kumru-2B-Base`
- **Source (target behavior) model:** `vngrs-ai/Kumru-2B` (SFT/chat)
- **Technique:** Low-Rank Adaptation (LoRA)
- **LoRA rank / alpha:** 768 / 1024 _(update these if you produce a different build)_
- **Layer coverage:** All self-attention and MLP projections
- **Output artifacts:** PEFT-compatible `adapter_config.json` + `adapter_model.safetensors`
- **License:** Apache 2.0 (aligned with VNGRS Kumru model licensing)
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = "vngrs-ai/Kumru-2B-Base"
adapter = "ceofast/kumru-2b-lora"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter, device_map="auto")
messages = [
{"role": "system", "content": "Adın Kumru, Türkçe konuşan yardımcı bir modelsin."},
{"role": "user", "content": "İstanbul'un fethi hakkında kısa bilgi verir misin?"}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
> Note: This adapter must be used together with `vngrs-ai/Kumru-2B-Base`.
## Extraction Process
The adapter is obtained by computing the delta between the base and the SFT checkpoints and factorizing it with **SVD**
into low-rank components. In this release, the measured reconstruction error is approximately **0.409**. To better preserve
quality, you may increase rank/alpha and export a new version (e.g., rank **1024** / alpha **2048**). A lower-error build will
be added as soon as possible.
- Script: `export_kumru.py`
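The delta-plus-SVD step described above can be sketched on a single weight matrix. The following is a minimal NumPy illustration with synthetic weights, not the actual `export_kumru.py` script; matrix sizes and the rank are placeholder values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for one projection matrix of the base and SFT checkpoints.
W_base = rng.standard_normal((256, 256)).astype(np.float32)
W_sft = W_base + 0.05 * rng.standard_normal((256, 256)).astype(np.float32)

rank = 64

# Delta between the SFT and base weights.
delta = W_sft - W_base

# Truncated SVD of the delta yields the low-rank LoRA factors.
U, S, Vt = np.linalg.svd(delta, full_matrices=False)
# Fold the singular values into one factor so that delta ≈ B @ A
# (matching the PEFT convention W' = W + B @ A, up to scaling).
B = U[:, :rank] * S[:rank]   # (out_features, rank)
A = Vt[:rank, :]             # (rank, in_features)

# Relative reconstruction error, analogous to the ~0.409 figure reported above.
err = np.linalg.norm(delta - B @ A) / np.linalg.norm(delta)
print(f"rank={rank} relative error={err:.3f}")
```

Raising the rank keeps more singular values and therefore shrinks the reconstruction error, which is the trade-off behind the suggested rank 1024 / alpha 2048 build.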
## Known Limitations
- Kumru-2B is still a ~2B-parameter model; it may struggle with very long context, rare technical terms, and complex math.
- With low ranks, SVD-based LoRA can be less stable than the original SFT checkpoint.
- Training data is based on VNGRS’s public Turkish corpus cleaning pipeline; truthfulness/hallucination issues may still occur.
### Framework versions
- PEFT 0.11.1