Turkish-LLM-7B-Instruct

An open-source, instruction-tuned Turkish language model at the 7B scale.



Highlights

  • Native Turkish - Trained specifically for Turkish language tasks
  • Instruction Following - Optimized for chat and Q&A
  • 7B Parameters - Balanced performance and efficiency
  • Open Source - Apache 2.0 License

Model Details

| Attribute | Value |
|---|---|
| Developer | Ogulcan Aydogan |
| Base Model | TURKCELL/Turkcell-LLM-7b-v1 |
| Parameters | 7 Billion |
| Language | Turkish (tr) |
| License | Apache 2.0 |
| Fine-tuning | LoRA (Low-Rank Adaptation) |
| Training Data | 125,000+ Turkish instruction-response pairs |

Model Family

| Model | Parameters | Base | Method | Use Case |
|---|---|---|---|---|
| Turkish-LLM-14B-Instruct | 14.7B | Qwen2.5-14B-Instruct | SFT | Higher quality, complex reasoning |
| Turkish-LLM-14B-Instruct-GGUF | 14.7B | 14B-Instruct | GGUF quantized | Local/edge deployment |
| Turkish-LLM-7B-Instruct (this) | 7B | Turkcell-LLM-7b-v1 | LoRA | Lightweight, faster inference |

Training

| Parameter | Value |
|---|---|
| Hardware | NVIDIA A100 80GB |
| Framework | PyTorch + Transformers + PEFT |
| Precision | bfloat16 |
| Final Loss | 1.88 |
| Learning Rate | 5e-6 |
| Batch Size | 16 |
| Max Sequence Length | 2048 |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
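The LoRA hyperparameters above (rank 64, alpha 128) can be expressed as a PEFT configuration. This is a minimal sketch only: the card does not state which target modules or dropout were used in training, so those values below are illustrative assumptions, not the actual training setup.

```python
from peft import LoraConfig

# Rank and alpha come from the Training table above.
# target_modules and lora_dropout are NOT documented on this card;
# the values here are common choices for Mistral-style architectures,
# shown purely as an illustration.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```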

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ogulcanaydogan/Turkish-LLM-7B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Türkiye'nin başkenti neresidir?"}  # "What is the capital of Turkey?"
]

# Build the ChatML-style prompt (see "Chat Template" below) and append
# the assistant header so the model continues as the assistant.
prompt = (
    "<|im_start|>user\n"
    + messages[0]["content"]
    + "<|im_end|>\n<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt,
# instead of string-splitting the full decoded output.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(response)

Ollama

ollama run hf.co/ogulcanaydogan/Turkish-LLM-7B-Instruct

Chat Template

<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_response}<|im_end|>
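The template above can also be assembled programmatically. A minimal sketch in plain Python, mirroring the ChatML-style format shown (the helper name `build_chatml_prompt` is illustrative, not part of the model's API):

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into the ChatML-style
    template used by this model's Chat Template section."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model generates the response.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([{"role": "user", "content": "Merhaba!"}])
```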

Example Outputs

Q: Türkiye'nin başkenti neresidir? ("What is the capital of Turkey?")
A: Türkiye'nin başkenti Ankara'dır. ("The capital of Turkey is Ankara.")

Q: Yapay zeka nedir? ("What is artificial intelligence?")
A: Yapay zeka, öğrenme ve akıl yürütme yeteneğine sahip bilgisayar sistemlerini ifade eder. ("Artificial intelligence refers to computer systems with the capacity to learn and reason.")

Hardware Requirements

| Precision | VRAM Required | Recommended |
|---|---|---|
| BF16 | ~14 GB | RTX 4090, A10G, M2 Pro (16GB) |
| INT8 | ~7 GB | RTX 3080, M1 Pro |
| INT4 | ~4 GB | RTX 3060, Apple M-series (8GB) |
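To fit the INT4 footprint above, the model can be loaded with 4-bit quantization via `bitsandbytes`. A minimal sketch, assuming the `bitsandbytes` package is installed and a CUDA GPU is available (the card itself does not ship a quantized variant in this repo; see the GGUF sibling for pre-quantized files):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 4-bit quantization with bfloat16 compute; requires bitsandbytes + CUDA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "ogulcanaydogan/Turkish-LLM-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```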

Intended Use

  • Turkish chatbots and virtual assistants
  • Question answering systems
  • Text generation and creative writing
  • Educational applications
  • NLP research for Turkish language

Limitations

  • May occasionally generate incorrect information (hallucinations)
  • Performance on very long contexts (>2048 tokens) may degrade
  • Not recommended for production without additional safety measures
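Because quality may degrade past the 2,048-token training length, one mitigation is to truncate long inputs to the most recent tokens before generation. A minimal sketch operating on raw token IDs (assumed to come from the tokenizer; the helper name is illustrative):

```python
def truncate_to_context(token_ids, max_len=2048, keep="tail"):
    """Keep at most max_len tokens. keep="tail" retains the most recent
    tokens (usually what you want for chat), keep="head" the earliest."""
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[-max_len:] if keep == "tail" else token_ids[:max_len]

# Example: a 3,000-token input is trimmed to its last 2,048 tokens.
ids = list(range(3000))
trimmed = truncate_to_context(ids)
```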

Related Resources

| Resource | Link |
|---|---|
| 14B Model | Turkish-LLM-14B-Instruct |
| 14B GGUF | Turkish-LLM-14B-Instruct-GGUF |
| Live Demo (14B) | Turkish-LLM-14B-Chat |
| Live Demo (7B) | Turkish-LLM-7B-Chat |
| Training Pipeline | LowResource-LLM-Forge |
| Project Repository | Turkish-LLM on GitHub |

Citation

@misc{turkish-llm-7b-instruct-2026,
  author = {Aydogan, Ogulcan},
  title = {Turkish-LLM-7B-Instruct: An Instruction-Tuned Turkish Language Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ogulcanaydogan/Turkish-LLM-7B-Instruct}
}

