Turkish-LLM-7B-Instruct 🇹🇷

An open-source, instruction-tuned Turkish language model at the 7B scale, fine-tuned from Turkcell-LLM-7b-v1.


Highlights

  • 🇹🇷 Native Turkish - Trained specifically for Turkish language tasks
  • 💬 Instruction Following - Optimized for chat and Q&A
  • 🚀 7B Parameters - Balanced performance and efficiency
  • 📖 Open Source - Apache 2.0 License

Model Details

Base Model: TURKCELL/Turkcell-LLM-7b-v1
Parameters: 7 billion
Language: Turkish (Türkçe)
License: Apache 2.0
Training Data: 125,000+ Turkish instruction-response pairs
Fine-tuning: LoRA (Low-Rank Adaptation)

Training

Hardware: NVIDIA A100 80GB
Training Time: ~10 hours
Framework: PyTorch + Transformers + PEFT
Precision: bfloat16
Final Loss: 1.88
Learning Rate: 5e-6
Batch Size: 16
Max Sequence Length: 2048 tokens
LoRA Rank: 64
LoRA Alpha: 128
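As a rough sanity check on the LoRA configuration above, the number of trainable adapter parameters can be estimated from the rank. The sketch below is a back-of-the-envelope calculation, assuming (hypothetically) that rank-64 adapters are attached to the four attention projections of a Mistral-style 7B model with hidden size 4096 and 32 layers, treated as square; the actual target modules and dimensions for this model are not stated in the card.

```python
# Back-of-the-envelope LoRA parameter count (assumed, not from the card).
hidden = 4096   # model hidden size (assumption)
layers = 32     # number of transformer layers (assumption)
rank = 64       # LoRA rank from the training table

# Each adapted d x d projection adds A (d x r) plus B (r x d) parameters.
per_proj = 2 * hidden * rank
trainable = layers * 4 * per_proj  # q, k, v, o projections in every layer

print(f"~{trainable / 1e6:.1f}M trainable parameters "
      f"({trainable / 7e9:.2%} of 7B)")
# → ~67.1M trainable parameters (0.96% of 7B)
```

This is why the run fits in ~10 hours on a single A100: under 1% of the weights receive gradients.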

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "ogulcanaydogan/turkish-llm-7b-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ogulcanaydogan/turkish-llm-7b-instruct")

# Chat format (ChatML-style template)
def chat(message):
    prompt = f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example
print(chat("Türkiye'nin başkenti neresidir?"))
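The chat() helper above formats a single turn; the same ChatML-style template extends to multi-turn conversations by concatenating the prior turns before the new message. A minimal, model-free sketch of the prompt construction (the template tags come from the snippet above; the build_prompt name is just for illustration):

```python
def build_prompt(history, message):
    """Build a ChatML-style prompt from (user, assistant) turn pairs."""
    parts = []
    for user_msg, assistant_msg in history:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{assistant_msg}<|im_end|>\n")
    # Leave the final assistant turn open so the model continues from here
    parts.append(f"<|im_start|>user\n{message}<|im_end|>\n"
                 f"<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt(
    [("Merhaba!", "Merhaba, size nasıl yardımcı olabilirim?")],
    "Türkiye'nin başkenti neresidir?",
)
print(prompt)
```

The resulting string can be tokenized and passed to model.generate() exactly as in chat() above.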

Example Outputs

Question Answering

Question: Türkiye'nin başkenti neresidir? (What is the capital of Turkey?)
Answer: Türkiye'nin başkenti Ankara'dır. (The capital of Turkey is Ankara.)

Question: Yapay zeka nedir? (What is artificial intelligence?)
Answer: Yapay zeka, öğrenme ve akıl yürütme yeteneğine sahip bilgisayar sistemlerini ifade eder. (Artificial intelligence refers to computer systems with the ability to learn and reason.)

Creative Writing

Prompt: Bana kısa bir hikaye yaz. (Write me a short story.)

Response: Bir varmış bir yokmuş, çok uzak bir diyarda, büyük bir maceraya atılan bir grup arkadaş varmış. Hepsi cesur ve cesurdu ve birbirlerine yardım etmek için her zaman hazırdı... (Once upon a time, in a faraway land, there was a group of friends setting off on a great adventure. They were all brave and brave, and always ready to help one another...)

Code Generation

Prompt: Python'da bir listeyi nasıl sıralarım? (How do I sort a list in Python?)

Response: Python'da bir listeyi sıralamanın birkaç yolu vardır. Yaygın yöntemlerden biri sorted() işlevini kullanmaktır... (There are several ways to sort a list in Python. One common method is to use the sorted() function...)
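For reference, the sorted() behaviour the model describes, alongside the in-place list.sort() alternative, works like this:

```python
numbers = [3, 1, 4, 1, 5]

# sorted() returns a new sorted list and leaves the original untouched
print(sorted(numbers))                # [1, 1, 3, 4, 5]
print(sorted(numbers, reverse=True))  # [5, 4, 3, 1, 1]

# list.sort() sorts in place and returns None
numbers.sort()
print(numbers)                        # [1, 1, 3, 4, 5]
```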

Use Cases

  • 🤖 Turkish chatbots and virtual assistants
  • ❓ Question answering systems
  • 📝 Text generation and creative writing
  • 📚 Educational applications
  • 🔬 NLP research for Turkish language

Limitations

  • May occasionally generate incorrect information (hallucinations)
  • Code generation sometimes uses Turkish keywords instead of English
  • Performance on very long contexts (>2048 tokens) may degrade
  • Not recommended for production without additional safety measures
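To mitigate the long-context limitation, inputs can be clipped to the trained window before generation. Below is a minimal sketch over raw token ids; the 2048 figure comes from the training table, while reserving room for max_new_tokens and the clip_to_window name are assumptions of this sketch.

```python
def clip_to_window(token_ids, max_len=2048, reserve=512):
    """Keep the most recent tokens so prompt + generation fit the window."""
    budget = max_len - reserve  # leave room for max_new_tokens
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

ids = list(range(3000))   # stand-in for tokenizer output
clipped = clip_to_window(ids)
print(len(clipped))       # 1536
print(clipped[0])         # 1464 (the oldest tokens were dropped)
```

In practice you would apply this to tokenizer output (or simply pass truncation=True, max_length=... to the tokenizer) before calling model.generate().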

Author

Ogulcan Aydogan

Citation

@misc{turkish-llm-7b-instruct-2026,
  author = {Aydogan, Ogulcan},
  title = {Turkish-LLM-7B-Instruct: An Instruction-Tuned Turkish Language Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ogulcanaydogan/turkish-llm-7b-instruct}
}

If you find this model useful, please ⭐ star the repository!

Made with ❤️ in Turkey 🇹🇷
