Qwen3.5-4B Saudi Dialect

This model is a Saudi-dialect conversational fine-tune of unsloth/Qwen3.5-4B, trained from the notebook qwen3-5-4b-saudi-dialect-sft-modal.ipynb and pushed to Hugging Face as a merged standalone model: AyoubChLin/Qwen3.5-4B-saudi-dialect.

The training setup uses Unsloth + TRL SFTTrainer with LoRA adapters; after training, the adapters are merged back into the base model for easier deployment.

Model Details

  • Base model: unsloth/Qwen3.5-4B
  • Fine-tuning method: LoRA SFT
  • Language: Arabic, focused on Saudi dialect conversations
  • Training modality in this run: text-only conversational SFT
  • Dataset split: 3545 total examples -> 3366 train / 179 eval
  • System prompt used in training: أنت مساعد مفيد يتحدث باللهجة السعودية العامية. ("You are a helpful assistant who speaks colloquial Saudi dialect.")
  • Tracking: Weights & Biases
  • W&B run: https://wandb.ai/cherguelainea/qwen-saudi-dialect/runs/6udmlaan

Training Arguments

| Argument | Value |
|---|---|
| max_seq_length | 4096 |
| load_in_4bit | False |
| load_in_8bit | False |
| lora_r | 16 |
| lora_alpha | 16 |
| lora_dropout | 0 |
| target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| use_gradient_checkpointing | "unsloth" |
| per_device_train_batch_size | 16 |
| per_device_eval_batch_size | 16 |
| gradient_accumulation_steps | 4 |
| Effective global batch size | 64 |
| warmup_steps | 5 |
| num_train_epochs | 4 |
| learning_rate | 4e-4 |
| lr_scheduler_type | linear |
| optim | adamw_8bit |
| weight_decay | 0.01 |
| dataset_text_field | messages |
| packing | True in config, but Unsloth reported "Sample packing skipped (vision-language model detected)" |
| remove_unused_columns | False |
| save_strategy | steps |
| save_steps | 100 |
| eval_strategy | steps |
| eval_steps | 50 |
| seed | 3407 |
| report_to | wandb |
| Precision used in this run | bf16 |
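As a sanity check, the optimizer-step count implied by these arguments matches the logged train/global_step of 212, assuming the final partial batch of each epoch still counts as a step:

```python
import math

train_examples = 3366
per_device_train_batch_size = 16
gradient_accumulation_steps = 4
num_gpus = 1
num_train_epochs = 4

# Effective global batch size: 16 * 4 * 1 = 64
global_batch = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

# Optimizer steps per epoch (the trailing partial batch still triggers a step)
steps_per_epoch = math.ceil(train_examples / global_batch)

total_steps = steps_per_epoch * num_train_epochs
print(global_batch, steps_per_epoch, total_steps)  # 64 53 212
```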

Training Results

Loss and Metrics

| Metric | Value |
|---|---|
| eval/loss | 1.49976 |
| train/loss (final W&B summary) | 1.18529 |
| training_loss (trainer_stats) | 1.48711 |
| train_runtime | 2490.3044 s (41.51 min) |
| train_samples_per_second | 5.407 |
| train_steps_per_second | 0.085 |
| eval/runtime | 9.6061 s |
| eval/samples_per_second | 18.53 |
| eval/steps_per_second | 1.249 |
| train/global_step | 212 |
| train/epoch | 4 |
| train/grad_norm | 0.69472 |
| total_flos | 7.760619536796672e+16 |
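The logged throughput numbers are mutually consistent: samples/second is the total number of training examples processed (train examples × epochs) divided by the runtime, and steps/second is the global step count divided by the runtime:

```python
train_runtime_s = 2490.3044
train_examples = 3366
epochs = 4
total_steps = 212

samples_per_second = train_examples * epochs / train_runtime_s
steps_per_second = total_steps / train_runtime_s

print(round(samples_per_second, 3))  # 5.407
print(round(steps_per_second, 3))   # 0.085
```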

Trainable Parameters

| Item | Value |
|---|---|
| Total parameters | 4,560,499,200 |
| Trainable LoRA parameters | 21,233,664 |
| Trainable ratio | 0.4656% |
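The trainable ratio follows directly from the two parameter counts above:

```python
total_params = 4_560_499_200
trainable_params = 21_233_664  # LoRA adapter weights only

ratio_pct = 100 * trainable_params / total_params
print(round(ratio_pct, 4))  # 0.4656
```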

Hardware

| Item | Value |
|---|---|
| GPU | NVIDIA A100-SXM4-40GB |
| Number of GPUs | 1 |
| CUDA toolkit | 12.9 |
| Torch | 2.8.0+cu129 |
| Transformers | 5.3.0 |
| Unsloth | 2026.3.6 |
| GPU total memory | 39.494 GB |
| GPU memory reserved before training | 8.547 GB |
| Peak reserved GPU memory | 38.455 GB |
| Peak reserved GPU memory for LoRA training | 29.908 GB |
| Peak GPU memory usage | 97.37% of available GPU memory |
| System RAM | Not logged in the notebook outputs |

The memory numbers above are GPU (VRAM) measurements taken during the training run; host system RAM was not logged in the notebook outputs.

Data Preparation

The dataset examples are conversation turns stored under messages. During preprocessing, a Saudi Arabic system prompt is prepended to each conversation before fine-tuning. The training notebook keeps only valid conversations and then performs a 5% evaluation split with seed 3407.
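The notebook code itself is not reproduced here, but the preprocessing described above can be sketched roughly as follows. This is a minimal, self-contained approximation: the helper name `add_system_prompt` and the plain-Python split are illustrative stand-ins (the notebook more likely uses the `datasets` library's `train_test_split(test_size=0.05, seed=3407)`):

```python
import random

# Training system prompt ("You are a helpful assistant who speaks
# colloquial Saudi dialect.")
SYSTEM_PROMPT = "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."

def add_system_prompt(example):
    """Prepend the Saudi Arabic system turn unless one is already present."""
    messages = example["messages"]
    if not messages or messages[0].get("role") != "system":
        messages = [{"role": "system", "content": SYSTEM_PROMPT}] + messages
    return {"messages": messages}

# Hypothetical stand-in for the dataset's conversations:
dataset = [
    {"messages": [{"role": "user", "content": "كيف حالك اليوم؟"}]},
    {"messages": [{"role": "user", "content": "وش أخبارك؟"}]},
]
dataset = [add_system_prompt(ex) for ex in dataset]

# 5% evaluation split with the notebook's seed (3407)
random.seed(3407)
random.shuffle(dataset)
n_eval = max(1, int(0.05 * len(dataset)))
eval_set, train_set = dataset[:n_eval], dataset[n_eval:]
```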

Usage

Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "AyoubChLin/Qwen3.5-4B-saudi-dialect"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."},
    {"role": "user", "content": "كيف حالك اليوم؟"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Unsloth

Install

```python
%%capture
import re, torch

# Pick the xformers wheel that matches the installed torch minor version
v = re.match(r"\d+\.\d+", str(torch.__version__)).group(0)
xformers = "xformers==" + {
    "2.10": "0.0.34",
    "2.9": "0.0.33.post1",
    "2.8": "0.0.32.post2",
}.get(v, "0.0.34")

!pip install sentencepiece protobuf "datasets>=2.18.0" "huggingface_hub>=0.34.0" hf_transfer wandb
!pip install --no-deps unsloth_zoo bitsandbytes accelerate {xformers} peft trl triton unsloth
!pip install -q "transformers>=5.0.0"
!pip install -q --no-deps "trl>=0.15.0"
```

Run

```python
from unsloth import FastLanguageModel

repo_id = "AyoubChLin/Qwen3.5-4B-saudi-dialect"
max_seq_length = 4096

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=repo_id,
    max_seq_length=max_seq_length,
    load_in_4bit=False,  # this repo was pushed as merged_16bit
)

FastLanguageModel.for_inference(model)

messages = [
    {
        "role": "system",
        "content": [
            {"type": "text", "text": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."}
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "كيف حالك اليوم؟"}
        ],
    },
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids=input_ids,
    max_new_tokens=200,
    use_cache=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```

Notes

  • This repository contains the merged full model pushed with save_method="merged_16bit".
  • A separate LoRA adapter repository is also available: AyoubChLin/Qwen3.5-4B-saudi-dialect-lora.
  • The base checkpoint is multimodal-capable, but this fine-tune was trained on text-only dialogue data.
  • The training data is conversational and dialect-specific, so outputs may reflect biases or stylistic patterns present in the source dataset.

