Karga-2B-Thinking 🐦‍⬛

Turkish SLM with Chain-of-Thought reasoning · 2 billion parameters · Edge-friendly

A fine-tune of Kumru-2B that thinks out loud before answering.

Hugging Face Spaces Open In Colab GitHub Apache 2.0

Karga-2B-Thinking is an advanced fine-tune of the vngrs-ai/Kumru-2B base model. Just as crows (Karga) are known for their exceptional problem-solving skills and tool use, this model has been explicitly engineered to bring Chain-of-Thought (CoT) reasoning capabilities to a 2-Billion parameter Small Language Model (SLM) for the Turkish language.

By generating a `<think> ... </think>` block before answering, the model plans its output step by step and significantly reduces hallucinations, making it well suited to mathematics, logic puzzles, and code generation on edge devices.

⚠️ Academic Pre-Publication Notice This model serves as the official checkpoint for an ongoing academic research project. While the model weights are fully open-source (Apache 2.0), the proprietary synthetic dataset and the novel "Deterministic Tensor Injection Agent" training/inference architecture are temporarily withheld pending double-blind peer review. Full resources will be released upon publication.

🚀 Model Details

  • Architecture: MistralForCausalLM (Kumru-2B)
  • Task: Causal Language Modeling with CoT
  • Parameters: 2 Billion (Optimized for Edge AI)
  • Language: Turkish
  • License: Apache 2.0 (Commercially friendly)

📊 Performance & Training

Large Language Models often struggle with complex logic in low-resource languages. To address this, the model was trained with QLoRA (via Unsloth) on a custom synthetic dataset, translated into Turkish and generated through vLLM pipelines.
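As a rough illustration only (the actual training configuration is withheld pending publication, so every value below is a hypothetical placeholder, not the Karga-2B-Thinking recipe), a QLoRA setup for a 2B model typically combines 4-bit quantization of the base weights with low-rank adapters:

```python
# Hypothetical QLoRA hyperparameters for a 2B-parameter model.
# None of these values come from the Karga-2B-Thinking training run;
# they are common community defaults shown purely for illustration.
qlora_config = {
    "load_in_4bit": True,          # quantize base weights to 4-bit NF4
    "lora_r": 16,                  # LoRA rank
    "lora_alpha": 32,              # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "learning_rate": 2e-4,
    "batch_size": 4,
    "gradient_accumulation_steps": 8,
}

# Effective batch size = per-device batch * accumulation steps
effective_batch = qlora_config["batch_size"] * qlora_config["gradient_accumulation_steps"]
print(effective_batch)  # 32
```

Because only the small adapter matrices are trained while the 4-bit base stays frozen, a setup like this fits comfortably on a single consumer GPU.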

On a held-out benchmark of 654 complex questions, fine-tuning yielded substantial improvements:

  • Mathematics: ~10x increase (0.49% ➔ 4.93%)
  • Python Coding: +11.47 percentage points (86.89% ➔ 98.36%)
  • Overall Average: 21.71% ➔ 24.31%
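The headline numbers above can be sanity-checked with quick arithmetic:

```python
# Verify the reported improvement figures from the benchmark bullets.
math_before, math_after = 0.49, 4.93
code_before, code_after = 86.89, 98.36

math_factor = math_after / math_before   # ratio behind the "~10x" claim
code_delta = code_after - code_before    # absolute gain in percentage points

print(round(math_factor, 2))  # 10.06
print(round(code_delta, 2))   # 11.47
```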

💻 Quick Usage

You can easily integrate this model into your Python projects. The model uses a specific chat template and outputs its reasoning inside <think> tags.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ilkayO/Karga-2B-Thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    device_map="auto", 
    torch_dtype=torch.bfloat16
)

prompt = "Aylin'in yaşı, Burak'ın yaşının iki katıdır. Burak 12 yaşında ise, ikisinin yaşları toplamı kaçtır?"
messages = [
    {"role": "system", "content": "Adın Karga. Soruları mantıklı ve adım adım düşünerek yanıtla."},
    {"role": "user", "content": prompt}
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=1024,
        do_sample=True,          # required for temperature/top_p to take effect
        temperature=0.6,
        top_p=0.9,
        repetition_penalty=1.1
    )

response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
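Since the model emits its reasoning inside `<think>` tags, you will usually want to separate the reasoning trace from the final answer before showing it to users. A minimal sketch using only the standard library (the helper name `split_thinking` is my own, not part of the model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    Assumes at most one <think>...</think> block; if none is
    found, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Example with a mock response (not actual model output):
demo = "<think>Burak 12, Aylin 24, toplam 36.</think>\nCevap: 36"
reasoning, answer = split_thinking(demo)
print(answer)  # Cevap: 36
```

Hiding the reasoning by default and exposing it behind a toggle is a common UX pattern for thinking models.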

🤝 Commercial Integration & Consulting

This model is open-sourced under the Apache 2.0 license, meaning you are free to use, modify, and integrate it into your commercial products.

If your company is looking to integrate advanced NLP systems, build Agentic AI workflows, deploy Edge AI models, or if you are interested in having me join your AI team, feel free to reach out!

📧 Contact: ilkayonay2001@gmail.com | LinkedIn
