Qwen3.5-9B Saudi Dialect
This repository contains merged full weights for Saudi-dialect chat generation, not just LoRA adapters. The model was fine-tuned from unsloth/Qwen3.5-9B with Unsloth LoRA SFT on Saudi Arabic conversations, then merged into a standalone merged_16bit checkpoint for direct use with plain transformers or Unsloth.
Model details
- Base model: unsloth/Qwen3.5-9B
- Training style: Unsloth LoRA SFT
- System prompt: أنت مساعد مفيد يتحدث باللهجة السعودية العامية. ("You are a helpful assistant who speaks colloquial Saudi Arabic.")
- Max sequence length: 4096
Training data and setup
- Dataset: HeshamHaroon/saudi-dialect-conversations
- Raw dataset size: 3545
- Post-filter split: 3366 train / 179 eval
- Eval split: 5%
- Seed: 3407
- LoRA config: r=16, alpha=16, dropout=0
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Batch size: 16
- Gradient accumulation: 4
- Effective batch size: 64
- Epochs: 4
- Learning rate: 4e-4
- Warmup steps: 5
- Optimizer: adamw_8bit
- Packing: enabled
- Hardware/runtime: NVIDIA A100-SXM4-80GB, bf16, 2608.5 s (43.48 min)
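For reference, the hyperparameters above can be assembled into a minimal Unsloth + TRL training sketch. This is a configuration sketch reconstructed from the listed values, not the original training script: the dataset variables (`train_ds`, `eval_ds`) and their preparation are assumptions, and the exact `SFTTrainer` keyword for the tokenizer may differ across TRL versions.

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load the base model in bf16 at the run's sequence length.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3.5-9B",
    max_seq_length=4096,
)

# Attach LoRA adapters with the listed config (r=16, alpha=16, dropout=0).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    random_state=3407,
)

# train_ds / eval_ds: pre-formatted chat datasets (preparation not shown).
trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    args=SFTConfig(
        per_device_train_batch_size=16,
        gradient_accumulation_steps=4,   # effective batch size 64
        num_train_epochs=4,
        learning_rate=4e-4,
        warmup_steps=5,
        optim="adamw_8bit",
        packing=True,
        seed=3407,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```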
Results
- Final eval/loss: 1.41955
- Final logged train/loss: 1.10579
- Trainer aggregate training_loss: 1.3807 (the run-level average reported by the trainer, not the last logged training step)
- Peak reserved memory: 57.799 GB
- LoRA-attributed reserved memory: 40.145 GB
- Peak memory share: 72.93%
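The reported figures are internally consistent, which a quick back-of-the-envelope check confirms (this ignores sequence packing, which changes the true number of batches per epoch):

```python
# Sanity-check the run statistics from the lists above.
train_examples = 3366
effective_batch = 16 * 4                  # per-device batch x grad accum
steps_per_epoch = train_examples / effective_batch
total_steps = steps_per_epoch * 4         # 4 epochs

peak_gb = 57.799
implied_total_gb = peak_gb / 0.7293       # from the 72.93% peak memory share

print(effective_batch)                    # 64, matching the listed value
print(round(total_steps))                 # ~210, consistent with the eval
                                          # curve ending near step 200
print(round(implied_total_gb, 1))         # ~79.3 GB, i.e. the A100's usable 80 GB
```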
In the logged loss curves, eval loss drops from about 1.47 early in training to about 1.40 around the middle of the run, then rises slightly to about 1.42 near step 200. Train loss falls from about 3.1 at the start to about 1.1 by the end.
The published repository is the final merged checkpoint from the run, not an explicitly selected best-eval checkpoint.
Usage
Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
repo_id = "AyoubChLin/Qwen3.5-9B-saudi-dialect"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
repo_id,
torch_dtype="auto",
device_map="auto",
)
messages = [
{"role": "system", "content": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."},
{"role": "user", "content": "كيف حالك اليوم؟"},
]
input_ids = tokenizer.apply_chat_template(
messages,
tokenize=True,
add_generation_prompt=True,
enable_thinking=False,
return_tensors="pt",
).to(model.device)
outputs = model.generate(
input_ids,
max_new_tokens=200,
temperature=0.7,
top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
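If loading the full bf16 weights is too heavy for the available GPU, the merged checkpoint can also be loaded 4-bit quantized via bitsandbytes. This is the standard transformers quantization pattern, not something verified specifically for this repository:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization with bf16 compute, reducing VRAM at some quality cost.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "AyoubChLin/Qwen3.5-9B-saudi-dialect",
    quantization_config=bnb_config,
    device_map="auto",
)
```

The tokenizer and chat-template usage are unchanged from the example above.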
Install Unsloth
%%capture
import re, torch
v = re.match(r"[\d]{1,}\.[\d]{1,}", str(torch.__version__)).group(0)
xformers = "xformers==" + {
"2.10": "0.0.34",
"2.9": "0.0.33.post1",
"2.8": "0.0.32.post2",
}.get(v, "0.0.34")
!pip install sentencepiece protobuf "datasets>=2.18.0" "huggingface_hub>=0.34.0" hf_transfer wandb
!pip install --no-deps unsloth_zoo bitsandbytes accelerate {xformers} peft trl triton unsloth
!pip install -q "transformers>=5.0.0"
!pip install -q --no-deps "trl>=0.15.0"
Unsloth
This repo was pushed as merged_16bit, so load it with load_in_4bit=False.
from unsloth import FastLanguageModel
repo_id = "AyoubChLin/Qwen3.5-9B-saudi-dialect"
max_seq_length = 4096
model, tokenizer = FastLanguageModel.from_pretrained(
model_name=repo_id,
max_seq_length=max_seq_length,
load_in_4bit=False, # this repo was pushed as merged_16bit
)
FastLanguageModel.for_inference(model)
messages = [
{
"role": "system",
"content": [
{"type": "text", "text": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."}
],
},
{
"role": "user",
"content": [
{"type": "text", "text": "كيف حالك اليوم؟"}
],
},
]
input_ids = tokenizer.apply_chat_template(
messages,
tokenize=True,
add_generation_prompt=True,
enable_thinking=False,
return_tensors="pt",
).to(model.device)
output_ids = model.generate(
input_ids=input_ids,
max_new_tokens=200,
use_cache=True,
temperature=0.7,
top_p=0.9,
)
response = tokenizer.decode(
output_ids[0][input_ids.shape[-1]:],
skip_special_tokens=True,
)
print(response)
Related artifacts
- LoRA adapters:
AyoubChLin/Qwen3.5-9B-saudi-dialect-lora
Limitations
- No external benchmark evaluation is included beyond train/eval loss on the source dataset.
- Saudi dialect coverage is likely uneven across regions, phrasing styles, and topics.
- The model can still hallucinate, over-generalize, or drift toward more formal Arabic depending on the prompt.