Committed by ogulcanaydogan
Commit 4bd2a8b · verified · 1 parent: 40b439b

Upload README.md with huggingface_hub
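
The commit message has the shape of the default message that `HfApi.upload_file` in `huggingface_hub` produces when no `commit_message` is passed, so the upload was probably done along these lines. This is a minimal sketch, not the author's actual script: the helper names `build_upload_kwargs` and `upload_readme` and the local file path are assumptions, and only the repository ID comes from the diff.

```python
def build_upload_kwargs(repo_id: str, local_path: str = "README.md") -> dict:
    """Collect keyword arguments for HfApi.upload_file (no network access)."""
    return {
        "path_or_fileobj": local_path,   # local file to push (assumed path)
        "path_in_repo": "README.md",     # destination path inside the repo
        "repo_id": repo_id,
        "repo_type": "model",
    }


def upload_readme(repo_id: str, local_path: str = "README.md"):
    """Push the model card; needs `pip install huggingface_hub` and a write token."""
    # Imported lazily so build_upload_kwargs stays usable without the package.
    from huggingface_hub import HfApi

    return HfApi().upload_file(**build_upload_kwargs(repo_id, local_path))


if __name__ == "__main__":
    # Repo ID taken from the diff; nothing is uploaded here.
    print(build_upload_kwargs("ogulcanaydogan/Turkish-LLM-14B-Instruct"))
```

Calling `upload_readme(...)` would create a new commit on the target repository, so it is left out of the `__main__` block on purpose.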

Files changed (1)
  1. README.md +16 -16
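
The 16 changed lines appear to do one thing: replace lowercase repository names with capitalized ones. A sketch of that substitution follows. The name pairs are copied from the removed and added lines of this commit; the functions `apply_renames` and `stale_names` are hypothetical helpers, and plain string replacement does not reproduce the few places where the commit also shortened a link label.

```python
# Old -> new repository names, copied from the removed/added lines of the diff.
RENAMES = {
    "turkish-llm-v10-training": "Turkish-LLM-v10-Training",
    "turkish-llm-14b-instruct": "Turkish-LLM-14B-Instruct",
    "turkish-llm-14b-chat": "Turkish-LLM-14B-Chat",
    "turkish-llm-7b-instruct": "Turkish-LLM-7B-Instruct",
    "turkish-llm-7b-chat": "Turkish-LLM-7B-Chat",
}


def apply_renames(text: str) -> str:
    """Replace each lowercase repository name with its capitalized form."""
    for old, new in RENAMES.items():
        text = text.replace(old, new)
    return text


def stale_names(text: str) -> list[str]:
    """Return the old names still present in `text`, for a post-edit check."""
    return [old for old in RENAMES if old in text]


if __name__ == "__main__":
    line = "ollama run hf.co/ogulcanaydogan/turkish-llm-14b-instruct"
    print(apply_renames(line))
```

Because the match is case-sensitive, the existing `ogulcanaydogan/Turkish-LLM` GitHub links are left untouched.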
README.md CHANGED
@@ -13,10 +13,10 @@ tags:
  - low-resource
  - nlp
  datasets:
- - ogulcanaydogan/turkish-llm-v10-training
+ - ogulcanaydogan/Turkish-LLM-v10-Training
  pipeline_tag: text-generation
  model-index:
- - name: turkish-llm-14b-instruct
+ - name: Turkish-LLM-14B-Instruct
  results: []
  ---
 
@@ -25,10 +25,10 @@ model-index:
  An open-source 14.7 billion parameter language model fine-tuned for native Turkish instruction following. Built on Qwen2.5-14B-Instruct using supervised fine-tuning (SFT) on a curated corpus of Turkish-language examples spanning science, history, geography, and general knowledge.
 
  <p align="center">
- <a href="https://huggingface.co/spaces/ogulcanaydogan/turkish-llm-14b-chat"><img src="https://img.shields.io/badge/Demo-Live_Chat-blue?style=for-the-badge&logo=huggingface" alt="Demo"></a>
+ <a href="https://huggingface.co/spaces/ogulcanaydogan/Turkish-LLM-14B-Chat"><img src="https://img.shields.io/badge/Demo-Live_Chat-blue?style=for-the-badge&logo=huggingface" alt="Demo"></a>
  <a href="https://github.com/ogulcanaydogan/Turkish-LLM"><img src="https://img.shields.io/badge/GitHub-Repository-black?style=for-the-badge&logo=github" alt="GitHub"></a>
- <a href="https://huggingface.co/datasets/ogulcanaydogan/turkish-llm-v10-training"><img src="https://img.shields.io/badge/Dataset-144K_samples-green?style=for-the-badge&logo=huggingface" alt="Dataset"></a>
- <a href="https://huggingface.co/ogulcanaydogan/turkish-llm-7b-instruct"><img src="https://img.shields.io/badge/Also_Available-7B_Model-yellow?style=for-the-badge&logo=huggingface" alt="7B"></a>
+ <a href="https://huggingface.co/datasets/ogulcanaydogan/Turkish-LLM-v10-Training"><img src="https://img.shields.io/badge/Dataset-144K_samples-green?style=for-the-badge&logo=huggingface" alt="Dataset"></a>
+ <a href="https://huggingface.co/ogulcanaydogan/Turkish-LLM-7B-Instruct"><img src="https://img.shields.io/badge/Also_Available-7B_Model-yellow?style=for-the-badge&logo=huggingface" alt="7B"></a>
  </p>
 
  ---
@@ -65,14 +65,14 @@ This model is part of the **Turkish-LLM** family:
 
  | Model | Parameters | Base | Method | Use Case |
  |-------|-----------|------|--------|----------|
- | **turkish-llm-14b-instruct** (this) | 14.7B | Qwen2.5-14B-Instruct | SFT | Higher quality, complex reasoning |
- | [turkish-llm-7b-instruct](https://huggingface.co/ogulcanaydogan/turkish-llm-7b-instruct) | 7B | Turkcell-LLM-7b-v1 | LoRA | Lightweight, faster inference |
+ | **Turkish-LLM-14B-Instruct** (this) | 14.7B | Qwen2.5-14B-Instruct | SFT | Higher quality, complex reasoning |
+ | [Turkish-LLM-7B-Instruct](https://huggingface.co/ogulcanaydogan/Turkish-LLM-7B-Instruct) | 7B | Turkcell-LLM-7b-v1 | LoRA | Lightweight, faster inference |
 
  ## Training
 
  ### Dataset
 
- Training data was sourced from the [turkish-llm-v10-training](https://huggingface.co/datasets/ogulcanaydogan/turkish-llm-v10-training) dataset — a curated collection of **144,000 Turkish instruction-response pairs** — with a focused SFT subset of approximately 2,600 high-quality examples selected for alignment.
+ Training data was sourced from the [Turkish-LLM-v10-Training](https://huggingface.co/datasets/ogulcanaydogan/Turkish-LLM-v10-Training) dataset — a curated collection of **144,000 Turkish instruction-response pairs** — with a focused SFT subset of approximately 2,600 high-quality examples selected for alignment.
 
  | Domain | Examples | Purpose |
  |--------|----------|---------|
@@ -120,7 +120,7 @@ Raw Turkish Data ──▶ Preprocessing ──▶ SFT Training ──▶ Evalua
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch
 
- model_id = "ogulcanaydogan/turkish-llm-14b-instruct"
+ model_id = "ogulcanaydogan/Turkish-LLM-14B-Instruct"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
  model_id,
@@ -150,7 +150,7 @@ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_t
 
  ```bash
  pip install vllm
- vllm serve ogulcanaydogan/turkish-llm-14b-instruct \
+ vllm serve ogulcanaydogan/Turkish-LLM-14B-Instruct \
  --dtype float16 \
  --max-model-len 4096
  ```
@@ -158,7 +158,7 @@ vllm serve ogulcanaydogan/turkish-llm-14b-instruct \
  ### Ollama (Local)
 
  ```bash
- ollama run hf.co/ogulcanaydogan/turkish-llm-14b-instruct
+ ollama run hf.co/ogulcanaydogan/Turkish-LLM-14B-Instruct
  ```
 
  ### Chat Template
@@ -218,10 +218,10 @@ This model is released under Apache 2.0 to support open research and development
 
  | Resource | Link |
  |----------|------|
- | 7B Model | [ogulcanaydogan/turkish-llm-7b-instruct](https://huggingface.co/ogulcanaydogan/turkish-llm-7b-instruct) |
- | Training Dataset (144K) | [ogulcanaydogan/turkish-llm-v10-training](https://huggingface.co/datasets/ogulcanaydogan/turkish-llm-v10-training) |
- | Live Demo (14B) | [turkish-llm-14b-chat](https://huggingface.co/spaces/ogulcanaydogan/turkish-llm-14b-chat) |
- | Live Demo (7B) | [turkish-llm-7b-chat](https://huggingface.co/spaces/ogulcanaydogan/turkish-llm-7b-chat) |
+ | 7B Model | [Turkish-LLM-7B-Instruct](https://huggingface.co/ogulcanaydogan/Turkish-LLM-7B-Instruct) |
+ | Training Dataset (144K) | [Turkish-LLM-v10-Training](https://huggingface.co/datasets/ogulcanaydogan/Turkish-LLM-v10-Training) |
+ | Live Demo (14B) | [Turkish-LLM-14B-Chat](https://huggingface.co/spaces/ogulcanaydogan/Turkish-LLM-14B-Chat) |
+ | Live Demo (7B) | [Turkish-LLM-7B-Chat](https://huggingface.co/spaces/ogulcanaydogan/Turkish-LLM-7B-Chat) |
  | Training Pipeline | [LowResource-LLM-Forge](https://github.com/ogulcanaydogan/LowResource-LLM-Forge) |
  | Project Repository | [Turkish-LLM on GitHub](https://github.com/ogulcanaydogan/Turkish-LLM) |
 
@@ -233,7 +233,7 @@ This model is released under Apache 2.0 to support open research and development
  author = {Aydogan, Ogulcan},
  year = {2026},
  publisher = {Hugging Face},
- url = {https://huggingface.co/ogulcanaydogan/turkish-llm-14b-instruct}
+ url = {https://huggingface.co/ogulcanaydogan/Turkish-LLM-14B-Instruct}
  }
  ```
 