Turkish-LLM-7B-Instruct 🇹🇷

The first open-source instruction-tuned Turkish language model at 7B scale.

Highlights

🇹🇷 Native Turkish - Trained specifically for Turkish language tasks
💬 Instruction Following - Optimized for chat and Q&A
🚀 7B Parameters - Balanced performance and efficiency
📖 Open Source - Apache 2.0 License

Model Details


Property	Value
Base Model	TURKCELL/Turkcell-LLM-7b-v1
Parameters	7 Billion
Language	Turkish (Türkçe)
License	Apache 2.0
Training Data	125,000+ Turkish instruction-response pairs
Fine-tuning	LoRA (Low-Rank Adaptation)
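As a rough guide to hardware needs, the memory taken by the weights alone can be estimated from the parameter count. The sketch below takes the card's rounded figure of 7 billion parameters at face value and assumes 2 bytes per parameter for bfloat16 (the dtype used in the Quick Start). It ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name is illustrative.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

# ASSUMPTION: exactly 7e9 parameters, the rounded figure from the card.
bf16_gib = weight_memory_gib(7e9)       # bfloat16: 2 bytes per parameter
fp32_gib = weight_memory_gib(7e9, 4)    # float32: 4 bytes per parameter

print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
print(f"float32 weights:  ~{fp32_gib:.1f} GiB")
```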

Training

Parameter	Value
Hardware	NVIDIA A100 80GB
Training Time	~10 hours
Framework	PyTorch + Transformers + PEFT
Precision	bfloat16
Final Loss	1.88
Learning Rate	5e-6
Batch Size	16
Max Sequence Length	2048
LoRA Rank	64
LoRA Alpha	128
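To make the LoRA settings concrete: LoRA adds a low-rank update to a frozen weight matrix, and in the standard formulation that update is scaled by alpha / rank. The sketch below works out the scaling factor and the number of trainable parameters per adapted matrix for the rank and alpha listed above. The 4096×4096 matrix shape is an assumption chosen for illustration; this card does not state the hidden size or which modules were adapted.

```python
def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / rank

def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in A (rank x d_in) plus B (d_out x rank) for one adapted matrix."""
    return rank * (d_in + d_out)

RANK, ALPHA = 64, 128      # values from the training table
D_OUT = D_IN = 4096        # ASSUMPTION: illustrative shape, not stated in this card

scaling = lora_scaling(ALPHA, RANK)
adapter = lora_trainable_params(D_OUT, D_IN, RANK)
full = D_OUT * D_IN

print(f"scaling factor: {scaling}")
print(f"adapter params per matrix: {adapter:,} of {full:,} in the frozen matrix")
```

With alpha set to twice the rank, the adapter update is applied at double weight, and each adapted matrix trains only a small fraction of the parameters it would under full fine-tuning.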

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "ogulcanaydogan/turkish-llm-7b-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ogulcanaydogan/turkish-llm-7b-instruct")

# Chat format
def chat(message):
    prompt = f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True
    )
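    # Decode only the newly generated tokens so the prompt is not echoed back
    # and the result does not depend on splitting on the word "assistant".
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()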

# Example
print(chat("Türkiye'nin başkenti neresidir?"))
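The `chat` helper above handles a single user turn. For multi-turn conversations, the same `<|im_start|>` / `<|im_end|>` layout can be extended to a list of messages. This is a minimal sketch that assumes the model accepts earlier turns in the same format as the single-turn prompt; `build_prompt` and the message dictionary layout are illustrative names, not part of the model's API.

```python
def build_prompt(messages: list[dict[str, str]]) -> str:
    """Render {"role": ..., "content": ...} messages in the chat layout used by
    the Quick Start, ending with an open assistant turn."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

history = [
    {"role": "user", "content": "Türkiye'nin başkenti neresidir?"},
    {"role": "assistant", "content": "Türkiye'nin başkenti Ankara'dır."},
    {"role": "user", "content": "Nüfusu ne kadar?"},  # "What is its population?"
]
prompt = build_prompt(history)
print(prompt)
```

The resulting string can be tokenized and passed to `model.generate` in the same way as the prompt inside `chat`.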

Example Outputs

Question Answering

Question (Soru)	Answer (Cevap)
Türkiye'nin başkenti neresidir?	Türkiye'nin başkenti Ankara'dır.
Yapay zeka nedir?	Yapay zeka, öğrenme ve akıl yürütme yeteneğine sahip bilgisayar sistemlerini ifade eder.

Creative Writing

Prompt: Bana kısa bir hikaye yaz.

Response: Bir varmış bir yokmuş, çok uzak bir diyarda, büyük bir maceraya atılan bir grup arkadaş varmış. Hepsi cesur ve cesurdu ve birbirlerine yardım etmek için her zaman hazırdı...

Code Generation

Prompt: Python'da bir listeyi nasıl sıralarım?

Response: Python'da bir listeyi sıralamanın birkaç yolu vardır. Yaygın yöntemlerden biri sorted() işlevini kullanmaktır...

Use Cases

🤖 Turkish chatbots and virtual assistants
❓ Question answering systems
📝 Text generation and creative writing
📚 Educational applications
🔬 NLP research for Turkish language

Limitations

May occasionally generate incorrect information (hallucinations)
Generated code sometimes contains Turkish words in place of English programming keywords
Output quality may degrade on contexts longer than the 2048-token training sequence length
Not recommended for production without additional safety measures
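One way to stay within the 2048-token range noted above is to drop the oldest turns of a conversation until the prompt fits. The sketch below is a hypothetical helper: `trim_history`, its parameters, and the whitespace-based token counter are illustrative stand-ins. In practice the count would come from the model's tokenizer, for example `len(tokenizer(text)["input_ids"])`.

```python
from typing import Callable

def trim_history(
    turns: list[str],
    count_tokens: Callable[[str], int],
    max_context: int = 2048,       # training sequence length from the card
    reserve_for_reply: int = 512,  # matches max_new_tokens in the Quick Start
) -> list[str]:
    """Keep the most recent turns whose total token count fits the budget."""
    budget = max_context - reserve_for_reply
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so recent context is preserved first.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))

# ASSUMPTION: whitespace splitting stands in for a real tokenizer here.
def rough_count(text: str) -> int:
    return len(text.split())

turns = ["eski mesaj " * 1000, "Merhaba!", "Bugün hava nasıl?"]
print(trim_history(turns, rough_count))
```

Dropping whole turns keeps each remaining message intact; truncating inside a message is an alternative when a single turn exceeds the budget on its own.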

Author

Ogulcan Aydogan


🌐 Website	ogulcanaydogan.com
🐙 GitHub	github.com/ogulcanaydogan
🤗 HuggingFace	huggingface.co/ogulcanaydogan
💼 LinkedIn	linkedin.com/in/ogulcanaydogan

Citation

@misc{turkish-llm-7b-instruct-2026,
  author = {Aydogan, Ogulcan},
  title = {Turkish-LLM-7B-Instruct: An Instruction-Tuned Turkish Language Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ogulcanaydogan/turkish-llm-7b-instruct}
}

Acknowledgments

Base model by TURKCELL
Training framework: HuggingFace Transformers
Fine-tuning: PEFT

If you find this model useful, please ⭐ star the repository!

Made with ❤️ in Turkey 🇹🇷
