Update README.md

README.md CHANGED

````diff
@@ -5,12 +5,13 @@ language:
 tags:
 - uzbek
 - english
-- sft
 - chat
 - transformers
 pipeline_tag: text-generation
 library_name: transformers
 license: other
+base_model:
+- Qwen/Qwen3-4B
 ---
 
 # NeuronAI-Uzbek
@@ -26,12 +27,12 @@ NeuronAI-Uzbek is a Qwen3-family causal language model fine-tuned to be helpful
 - **Attention heads**: 32 (KV heads: 8)
 - **Vocab size**: 180,000
 - **Max position embeddings**: 40,960 (model config)
-- **Generation defaults**
+- **Generation defaults**
   - `temperature=0.6`
   - `top_p=0.95`
   - `top_k=20`
 
-Note:
+Note: This model is from the **Qwen3** family and is intended to be used with recent `transformers`.
 
 ## Training data (token counts)
 
@@ -42,7 +43,7 @@ This model was trained on a mixture of:
 
 Total: **2.0B tokens**.
 
-## Training process
+## Training process
 
 We trained NeuronAI-Uzbek in stages:
 
@@ -55,17 +56,6 @@ We trained NeuronAI-Uzbek in stages:
    - Continued training / adaptation on the mixed corpus (2.0B tokens total) to improve Uzbek capability while retaining English.
 
 3. **Supervised fine-tuning (SFT)**
-   - Final fine-tuning checkpoint is stored under `runs/honest_sft/final` during training and uploaded here.
-   - Key hyperparameters recovered from `training_args.bin`:
-     - **Epochs**: 1
-     - **Learning rate**: 5e-6
-     - **Scheduler**: cosine, **warmup_ratio**: 0.03
-     - **Optimizer**: `paged_adamw_8bit`
-     - **Per-device train batch size**: 2
-     - **Gradient accumulation**: 4
-     - **Gradient checkpointing**: enabled
-     - **Seed**: 42
-     - **bf16**: enabled
 
 4. **Export**
    - Exported weights to `safetensors` shards + index.
@@ -174,4 +164,4 @@ If you use this model, please cite the repository:
   howpublished = {\url{https://huggingface.co/NeuronUz/NeuronAI-Uzbek}},
   year = {2025}
 }
-```
+```
````
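
The generation defaults kept in the card (`temperature=0.6`, `top_p=0.95`, `top_k=20`) can be illustrated with a minimal sketch of temperature scaling plus top-k and top-p (nucleus) filtering over a logit vector. This is an illustrative stand-alone implementation, not code from the repository; in practice `transformers` applies these filters internally during `generate()`:

```python
import math

def filter_logits(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Apply temperature, then top-k and top-p filtering to raw logits.

    Returns a probability distribution in which only tokens that survive
    both filters carry non-zero mass.
    """
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [l / temperature for l in logits]

    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:top_k])

    # Top-p: within the kept tokens, keep the smallest prefix whose
    # cumulative probability reaches top_p.
    cum, nucleus = 0.0, set()
    for i in order:
        if i not in keep:
            break
        nucleus.add(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalise the surviving tokens so they sum to 1.
    mass = sum(probs[i] for i in nucleus)
    return [probs[i] / mass if i in nucleus else 0.0 for i in range(len(probs))]
```

With `top_p=0.95` the long tail of unlikely tokens is dropped before sampling, and `temperature=0.6` makes the surviving distribution sharper, which favours more deterministic completions.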