Karga-2B-Thinking 🐦‍⬛

Turkish SLM with Chain-of-Thought reasoning · 2 billion parameters · Edge-friendly

A fine-tune of Kumru-2B that thinks out loud before answering.

Hugging Face Spaces Open In Colab GitHub Apache 2.0

Karga-2B-Thinking is an advanced fine-tune of the vngrs-ai/Kumru-2B base model. Just as crows (Karga) are known for their exceptional problem-solving skills and tool use, this model has been explicitly engineered to bring Chain-of-Thought (CoT) reasoning capabilities to a 2-Billion parameter Small Language Model (SLM) for the Turkish language.

By generating a `<think> ... </think>` block before answering, the model plans its output step by step and significantly reduces hallucinations, making it well suited to mathematics, logic puzzles, and code generation on edge devices.

⚠️ Academic Pre-Publication Notice This model serves as the official checkpoint for an ongoing academic research project. While the model weights are fully open-source (Apache 2.0), the proprietary synthetic dataset and the novel "Deterministic Tensor Injection Agent" training/inference architecture are temporarily withheld pending double-blind peer review. Full resources will be released upon publication.

🚀 Model Details

  • Architecture: MistralForCausalLM (Kumru-2B)
  • Task: Causal Language Modeling with CoT
  • Parameters: 2 Billion (Optimized for Edge AI)
  • Language: Turkish
  • License: Apache 2.0 (Commercially friendly)

📊 Performance & Training

Large Language Models often struggle with complex logic in low-resource languages. To address this, the model was trained with QLoRA (via Unsloth) on a custom synthetic dataset, translated into Turkish and generated through vLLM pipelines.
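As a rough illustration only (the actual training configuration is withheld pending publication, so every value below is a hypothetical placeholder, not the Karga-2B-Thinking recipe), a QLoRA setup for a 2B model typically combines 4-bit quantization of the base weights with low-rank adapters:

```python
# Hypothetical QLoRA hyperparameters for a 2B-parameter model.
# None of these values come from the Karga-2B-Thinking training run;
# they are common community defaults shown purely for illustration.
qlora_config = {
    "load_in_4bit": True,          # quantize base weights to 4-bit NF4
    "lora_r": 16,                  # LoRA rank
    "lora_alpha": 32,              # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "learning_rate": 2e-4,
    "batch_size": 4,
    "gradient_accumulation_steps": 8,
}

# Effective batch size = per-device batch * accumulation steps
effective_batch = qlora_config["batch_size"] * qlora_config["gradient_accumulation_steps"]
print(effective_batch)  # 32
```

Because only the small adapter matrices are trained while the 4-bit base stays frozen, a setup like this fits comfortably on a single consumer GPU.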

On a held-out benchmark of 654 complex questions, fine-tuning yielded substantial improvements:

  • Mathematics: ~10x increase (0.49% ➔ 4.93%)
  • Python Coding: +11.47 percentage points (86.89% ➔ 98.36%)
  • Overall Average: 21.71% ➔ 24.31%
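The headline numbers above can be sanity-checked with quick arithmetic:

```python
# Verify the reported improvement figures from the benchmark bullets.
math_before, math_after = 0.49, 4.93
code_before, code_after = 86.89, 98.36

math_factor = math_after / math_before   # ratio behind the "~10x" claim
code_delta = code_after - code_before    # absolute gain in percentage points

print(round(math_factor, 2))  # 10.06
print(round(code_delta, 2))   # 11.47
```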

💻 Quick Usage

You can easily integrate this model into your Python projects. The model uses a specific chat template and outputs its reasoning inside <think> tags.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ilkayO/Karga-2B-Thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    device_map="auto", 
    torch_dtype=torch.bfloat16
)

prompt = "Aylin'in yaşı, Burak'ın yaşının iki katıdır. Burak 12 yaşında ise, ikisinin yaşları toplamı kaçtır?"
messages = [
    {"role": "system", "content": "Adın Karga. Soruları mantıklı ve adım adım düşünerek yanıtla."},
    {"role": "user", "content": prompt}
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=1024,
        do_sample=True,          # required for temperature/top_p to take effect
        temperature=0.6,
        top_p=0.9,
        repetition_penalty=1.1
    )

response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
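Since the model emits its reasoning inside `<think>` tags, you will usually want to separate the reasoning trace from the final answer before showing it to users. A minimal sketch using only the standard library (the helper name `split_thinking` is my own, not part of the model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    Assumes at most one <think>...</think> block; if none is
    found, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Example with a mock response (not actual model output):
demo = "<think>Burak 12, Aylin 24, toplam 36.</think>\nCevap: 36"
reasoning, answer = split_thinking(demo)
print(answer)  # Cevap: 36
```

Hiding the reasoning by default and exposing it behind a toggle is a common UX pattern for thinking models.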

🤝 Commercial Integration & Consulting

This model is open-sourced under the Apache 2.0 license, meaning you are free to use, modify, and integrate it into your commercial products.

If your company is looking to integrate advanced NLP systems, build Agentic AI workflows, deploy Edge AI models, or if you are interested in having me join your AI team, feel free to reach out!

📧 Contact: ilkayonay2001@gmail.com | LinkedIn
