Add model card with training details and benchmarks
README.md CHANGED
````diff
@@ -17,7 +17,7 @@ tags:
 - qwen3.5
 pipeline_tag: text-generation
 model-index:
-- name: lale-9b-
+- name: lale-9b-2603
   results:
   - task:
       type: text-generation
@@ -40,7 +40,7 @@ model-index:
       value: 0.376
 ---
 
-# lale-9b-
+# lale-9b-2603
 
 **lale** (Turkish for "tulip") is a Turkish instruction-following language model fine-tuned from [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B). It is designed to be the best Turkish language model at its size class, with strong performance in general knowledge, reasoning, tool use, grammar, finance, and legal domains.
 
@@ -81,9 +81,9 @@ All data was filtered for format validity, length bounds, exact deduplication, a
 
 Evaluated using the [terazi](https://github.com/selimozten/terazi) Turkish language model benchmark suite.
 
-### 
+### lale-9b-2602 vs lale-9b-2603
 
-| Category |
+| Category | 2602 (98K data) | 2603 (118K data) | Change |
 |---|---|---|---|
 | **core** | 0.511 | **0.516** | +1.0% |
 | common_sense | 0.970 | **0.980** | +1.0% |
@@ -138,13 +138,13 @@ ollama run lale
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 model = AutoModelForCausalLM.from_pretrained(
-    "comarproject/lale-9b-
+    "comarproject/lale-9b-2603",
     subfolder="merged",
     torch_dtype="bfloat16",
     device_map="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained(
-    "comarproject/lale-9b-
+    "comarproject/lale-9b-2603",
     subfolder="merged",
 )
 
@@ -169,7 +169,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 - Trained primarily on synthetic data from Claude models; may reflect Claude's style and biases
 - Context window limited to 2048 tokens during training (base model supports 128K)
-- Sentiment analysis regressed from
+- Sentiment analysis regressed from 2602 (-20%) -- may need targeted data for this subcategory
 - Some long legal/financial prompts may exceed the trained context length
 
 ## License
@@ -179,10 +179,10 @@ Apache 2.0
 ## Citation
 
 ```bibtex
-@misc{lale-9b-
-  title={lale-9b-
+@misc{lale-9b-2603,
+  title={lale-9b-2603: Turkish Instruction Model Distilled from Frontier Models},
   author={Selim Ozten},
   year={2026},
-  url={https://huggingface.co/comarproject/lale-9b-
+  url={https://huggingface.co/comarproject/lale-9b-2603}
 }
```
````
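
The "Change" column in the benchmark table above reads as a relative percent change between the two checkpoints' scores (for the `core` row, `(0.516 - 0.511) / 0.511` is about +1.0%, matching the table; an absolute difference would give +0.5). A minimal sketch of that arithmetic, assuming this relative-change interpretation:

```python
def relative_change_pct(old: float, new: float) -> float:
    """Percent change from the old benchmark score to the new one."""
    return (new - old) / old * 100

# Rows quoted in the diff above (2602 score, 2603 score):
core = relative_change_pct(0.511, 0.516)          # ~0.98, reported as +1.0%
common_sense = relative_change_pct(0.970, 0.980)  # ~1.03, reported as +1.0%

print(f"core: {core:+.1f}%")                  # core: +1.0%
print(f"common_sense: {common_sense:+.1f}%")  # common_sense: +1.0%
```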