comarproject committed
Commit ba72741 · verified · 1 Parent(s): eda4955

Add model card with training details and benchmarks

Files changed (1): README.md +10 -10
README.md CHANGED

````diff
@@ -17,7 +17,7 @@ tags:
 - qwen3.5
 pipeline_tag: text-generation
 model-index:
-- name: lale-9b-v2
+- name: lale-9b-2603
   results:
   - task:
       type: text-generation
@@ -40,7 +40,7 @@ model-index:
       value: 0.376
 ---
 
-# lale-9b-v2
+# lale-9b-2603
 
 **lale** (Turkish for "tulip") is a Turkish instruction-following language model fine-tuned from [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B). It is designed to be the best Turkish language model at its size class, with strong performance in general knowledge, reasoning, tool use, grammar, finance, and legal domains.
 
@@ -81,9 +81,9 @@ All data was filtered for format validity, length bounds, exact deduplication, a
 
 Evaluated using the [terazi](https://github.com/selimozten/terazi) Turkish language model benchmark suite.
 
-### v1 vs v2 Comparison
+### lale-9b-2602 vs lale-9b-2603
 
-| Category | v1 (98K data) | v2 (118K data) | Change |
+| Category | 2602 (98K data) | 2603 (118K data) | Change |
 |---|---|---|---|
 | **core** | 0.511 | **0.516** | +1.0% |
 | common_sense | 0.970 | **0.980** | +1.0% |
@@ -138,13 +138,13 @@ ollama run lale
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 model = AutoModelForCausalLM.from_pretrained(
-    "comarproject/lale-9b-v2",
+    "comarproject/lale-9b-2603",
     subfolder="merged",
     torch_dtype="bfloat16",
     device_map="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained(
-    "comarproject/lale-9b-v2",
+    "comarproject/lale-9b-2603",
     subfolder="merged",
 )
 
@@ -169,7 +169,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 - Trained primarily on synthetic data from Claude models; may reflect Claude's style and biases
 - Context window limited to 2048 tokens during training (base model supports 128K)
-- Sentiment analysis regressed from v1 (-20%) -- may need targeted data for this subcategory
+- Sentiment analysis regressed from 2602 (-20%) -- may need targeted data for this subcategory
 - Some long legal/financial prompts may exceed the trained context length
 
 ## License
@@ -179,10 +179,10 @@ Apache 2.0
 ## Citation
 
 ```bibtex
-@misc{lale-9b-v2,
-  title={lale-9b-v2: Turkish Instruction Model Distilled from Frontier Models},
+@misc{lale-9b-2603,
+  title={lale-9b-2603: Turkish Instruction Model Distilled from Frontier Models},
   author={Selim Ozten},
   year={2026},
-  url={https://huggingface.co/comarproject/lale-9b-v2}
+  url={https://huggingface.co/comarproject/lale-9b-2603}
 }
 ```
````
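
The Change column in the benchmark table appears to be the relative improvement of the newer scores over the older ones (not absolute percentage points). A quick sanity check, using the two rows visible in this diff:

```python
# Verify the "Change" column: relative improvement, new over old,
# for the two benchmark rows shown in the model card diff.
rows = {
    "core": (0.511, 0.516),
    "common_sense": (0.970, 0.980),
}

for name, (old, new) in rows.items():
    change = (new - old) / old * 100  # percent relative improvement
    print(f"{name}: {change:+.1f}%")
```

Both rows round to the +1.0% shown in the card, which is consistent with the relative-improvement reading.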