Qwen3.5-9B Saudi Dialect

This repository contains the merged full weights for Saudi-dialect chat generation, not just LoRA adapters. The model was fine-tuned from unsloth/Qwen3.5-9B with Unsloth LoRA SFT on Saudi Arabic conversations, then merged into a standalone merged_16bit checkpoint for direct use with plain transformers or with Unsloth.

Model details

  • Base model: unsloth/Qwen3.5-9B
  • Training style: Unsloth LoRA SFT
  • System prompt: أنت مساعد مفيد يتحدث باللهجة السعودية العامية. ("You are a helpful assistant who speaks colloquial Saudi dialect.")
  • Max sequence length: 4096

Training data and setup

  • Dataset: HeshamHaroon/saudi-dialect-conversations
  • Raw dataset size: 3,545 conversations
  • Post-filter split: 3366 train / 179 eval
  • Eval split: 5%
  • Seed: 3407
  • LoRA config: r=16, alpha=16, dropout=0
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Batch size: 16
  • Gradient accumulation: 4
  • Effective batch size: 64
  • Epochs: 4
  • Learning rate: 4e-4
  • Warmup steps: 5
  • Optimizer: adamw_8bit
  • Packing: enabled
  • Hardware/runtime: NVIDIA A100-SXM4-80GB, bf16, 2608.5s (43.48 min)
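With the configuration above, each targeted projection gains r·(d_in + d_out) trainable parameters (alpha only scales the update; it adds no parameters, and with alpha=16 and r=16 the scaling factor alpha/r is 1.0). A quick sketch of that bookkeeping; the layer dimensions below are hypothetical, for illustration only:

```python
# Sketch: trainable-parameter count for a LoRA adapter with r=16.
# Layer dimensions here are hypothetical examples, not the model's actual shapes.

def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    """LoRA factorizes the weight update as B @ A, with A: (r, d_in)
    and B: (d_out, r), so it adds r * (d_in + d_out) parameters."""
    return r * (d_in + d_out)

# Example: a square 4096x4096 projection (hypothetical size)
print(lora_params(4096, 4096, r=16))  # 16 * (4096 + 4096) = 131072

# The adapter's total trainable count is this sum over every targeted
# module (q/k/v/o_proj and gate/up/down_proj) across all layers.
```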

Results

  • Final eval/loss: 1.41955
  • Final logged train/loss: 1.10579
  • Trainer aggregate training_loss: 1.3807367114525921
    This is the run-level average reported by the trainer, not the last logged training step.
  • Peak reserved memory: 57.799 GB
  • LoRA-attributed reserved memory: 40.145 GB
  • Peak memory share: 72.93%
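These memory figures follow the usual Unsloth-notebook bookkeeping: peak reserved memory as a share of the GPU's reported total, and the LoRA-attributed portion as peak minus the post-load baseline. A sketch that reproduces the reported percentages from the logged values; the ~79.25 GB total is an assumption (the value torch typically reports for an A100-80GB), not a logged number:

```python
# Sketch: back-derive the reported memory stats from the logged values.
# total_gb is an ASSUMED reported total for the A100-SXM4-80GB, not from the log.
peak_reserved_gb = 57.799
lora_reserved_gb = 40.145
total_gb = 79.25  # assumed torch.cuda.get_device_properties(0).total_memory / 1024**3

peak_share = round(peak_reserved_gb / total_gb * 100, 2)
print(peak_share)  # 72.93, matching the reported peak memory share

# Baseline reserved right after model load, before training (derived, not logged):
baseline_gb = round(peak_reserved_gb - lora_reserved_gb, 3)
print(baseline_gb)  # 17.654
```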

From the logged loss curves, eval loss drops from about 1.47 early in training to about 1.40 around the middle of the run, then rises slightly to about 1.42 near step 200, while train loss falls from about 3.1 at the start to about 1.1 by the end. The late uptick in eval loss alongside the still-falling train loss suggests mild overfitting toward the end of the run.

The published repository is the final merged checkpoint from the run, not an explicitly selected best-eval checkpoint.

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "AyoubChLin/Qwen3.5-9B-saudi-dialect"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."},
    {"role": "user", "content": "كيف حالك اليوم؟"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=200,
    do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
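The decode above slices the output at input_ids.shape[-1] because generate returns the prompt tokens followed by the newly generated tokens in one sequence. The same slicing with plain lists (the token IDs below are illustrative, not real vocabulary entries):

```python
# Sketch: why the decode slices at input_ids.shape[-1].
# generate() returns [prompt tokens] + [new tokens] as a single sequence.
prompt_ids = [101, 2023, 2003]           # illustrative prompt token IDs
generated = prompt_ids + [7592, 102]     # what generate() hands back

new_tokens = generated[len(prompt_ids):]  # same idea as outputs[0][input_ids.shape[-1]:]
print(new_tokens)  # [7592, 102] — only the newly generated tokens remain
```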

Install Unsloth

%%capture
import re, torch

v = re.match(r"[\d]{1,}\.[\d]{1,}", str(torch.__version__)).group(0)
xformers = "xformers==" + {
    "2.10": "0.0.34",
    "2.9": "0.0.33.post1",
    "2.8": "0.0.32.post2",
}.get(v, "0.0.34")

!pip install sentencepiece protobuf "datasets>=2.18.0" "huggingface_hub>=0.34.0" hf_transfer wandb
!pip install --no-deps unsloth_zoo bitsandbytes accelerate {xformers} peft trl triton unsloth
!pip install -q "transformers>=5.0.0"
!pip install -q --no-deps "trl>=0.15.0"

Unsloth

This repo was pushed as merged_16bit, so load it with load_in_4bit=False.

from unsloth import FastLanguageModel

repo_id = "AyoubChLin/Qwen3.5-9B-saudi-dialect"
max_seq_length = 4096

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=repo_id,
    max_seq_length=max_seq_length,
    load_in_4bit=False,  # this repo was pushed as merged_16bit
)

FastLanguageModel.for_inference(model)

messages = [
    {
        "role": "system",
        "content": [
            {"type": "text", "text": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."}
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "كيف حالك اليوم؟"}
        ],
    },
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids=input_ids,
    max_new_tokens=200,
    use_cache=True,
    do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)

response = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:],
    skip_special_tokens=True,
)
print(response)

Related artifacts

  • LoRA adapters: AyoubChLin/Qwen3.5-9B-saudi-dialect-lora

Limitations

  • No external benchmark evaluation is included beyond train/eval loss on the source dataset.
  • Saudi dialect coverage is likely uneven across regions, phrasing styles, and topics.
  • The model can still hallucinate, over-generalize, or drift toward more formal Arabic depending on the prompt.