Add model card with training details and benchmarks
README.md CHANGED
````diff
@@ -17,7 +17,7 @@ tags:
 - qwen3.5
 pipeline_tag: text-generation
 model-index:
-- name: lale-9b-
+- name: lale-9b-2603
   results:
   - task:
       type: text-generation
@@ -40,7 +40,7 @@ model-index:
       value: 0.376
 ---
 
-# lale-9b-
+# lale-9b-2603
 
 **lale** (Turkish for "tulip") is a Turkish instruction-following language model fine-tuned from [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B). It is designed to be the best Turkish language model at its size class, with strong performance in general knowledge, reasoning, tool use, grammar, finance, and legal domains.
 
@@ -81,9 +81,9 @@ All data was filtered for format validity, length bounds, exact deduplication, a
 
 Evaluated using the [terazi](https://github.com/selimozten/terazi) Turkish language model benchmark suite.
 
-### 
+### lale-9b-2602 vs lale-9b-2603
 
-| Category |
+| Category | 2602 (98K data) | 2603 (118K data) | Change |
 |---|---|---|---|
 | **core** | 0.511 | **0.516** | +1.0% |
 | common_sense | 0.970 | **0.980** | +1.0% |
@@ -138,13 +138,13 @@ ollama run lale
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 model = AutoModelForCausalLM.from_pretrained(
-    "comarproject/lale-9b-
+    "comarproject/lale-9b-2603",
     subfolder="merged",
     torch_dtype="bfloat16",
     device_map="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained(
-    "comarproject/lale-9b-
+    "comarproject/lale-9b-2603",
     subfolder="merged",
 )
 
@@ -169,7 +169,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 - Trained primarily on synthetic data from Claude models; may reflect Claude's style and biases
 - Context window limited to 2048 tokens during training (base model supports 128K)
-- Sentiment analysis regressed from
+- Sentiment analysis regressed from 2602 (-20%) -- may need targeted data for this subcategory
 - Some long legal/financial prompts may exceed the trained context length
 
 ## License
@@ -179,10 +179,10 @@ Apache 2.0
 ## Citation
 
 ```bibtex
-@misc{lale-9b-
-  title={lale-9b-
+@misc{lale-9b-2603,
+  title={lale-9b-2603: Turkish Instruction Model Distilled from Frontier Models},
   author={Selim Ozten},
   year={2026},
-  url={https://huggingface.co/comarproject/lale-9b-
+  url={https://huggingface.co/comarproject/lale-9b-2603}
 }
```
````
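
The "Change" column in the benchmark table above reads as a relative percent change between the two checkpoints' scores (for the `core` row, `(0.516 - 0.511) / 0.511` is about +1.0%, matching the table; an absolute difference would give +0.5). A minimal sketch of that arithmetic, assuming this relative-change interpretation:

```python
def relative_change_pct(old: float, new: float) -> float:
    """Percent change from the old benchmark score to the new one."""
    return (new - old) / old * 100

# Rows quoted in the diff above (2602 score, 2603 score):
core = relative_change_pct(0.511, 0.516)          # ~0.98, reported as +1.0%
common_sense = relative_change_pct(0.970, 0.980)  # ~1.03, reported as +1.0%

print(f"core: {core:+.1f}%")                  # core: +1.0%
print(f"common_sense: {common_sense:+.1f}%")  # common_sense: +1.0%
```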