Qwen3.5-9B Saudi Dialect
This repository contains merged full weights for Saudi-dialect chat generation, not just LoRA adapters. The model was fine-tuned from unsloth/Qwen3.5-9B with Unsloth LoRA SFT on Saudi Arabic conversations, then merged into a standalone merged_16bit checkpoint for direct use with plain transformers or Unsloth.
Model details
- Base model: unsloth/Qwen3.5-9B
- Training style: Unsloth LoRA SFT
- System prompt: أنت مساعد مفيد يتحدث باللهجة السعودية العامية. ("You are a helpful assistant who speaks colloquial Saudi dialect.")
- Max sequence length: 4096
Training data and setup
- Dataset: HeshamHaroon/saudi-dialect-conversations
- Raw dataset size: 3545
- Post-filter split: 3366 train / 179 eval
- Eval split: 5%
- Seed: 3407
- LoRA config: r=16, alpha=16, dropout=0
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Batch size: 16
- Gradient accumulation: 4
- Effective batch size: 64
- Epochs: 4
- Learning rate: 4e-4
- Warmup steps: 5
- Optimizer: adamw_8bit
- Packing: enabled
- Hardware/runtime: NVIDIA A100-SXM4-80GB, bf16, 2608.5 s (43.48 min)
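The figures above are internally consistent; a quick sanity check (all numbers are taken from the list, the step estimate assumes one optimizer step per effective batch and ignores packing, which merges short conversations and lowers the real step count):

```python
import math

# Figures from the model card
train_n, eval_n = 3366, 179
raw_size = 3545
per_device_bs, grad_accum, epochs = 16, 4, 4

# The train/eval split adds back up to the raw size,
# and the eval share is roughly the stated 5%.
assert train_n + eval_n == raw_size
eval_share = eval_n / raw_size                       # ~0.0505

# Effective batch size = per-device batch * gradient accumulation
effective_bs = per_device_bs * grad_accum            # 64

# Upper bound on optimizer steps (packing lowers the real count)
steps_per_epoch = math.ceil(train_n / effective_bs)  # 53
total_steps = steps_per_epoch * epochs               # 212

print(effective_bs, round(eval_share, 4), total_steps)
```

The rough 212-step upper bound is in line with the loss curve ending near step 200.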
Results
- Final eval/loss: 1.41955
- Final logged train/loss: 1.10579
- Trainer aggregate training_loss: 1.3807367114525921 (the run-level average reported by the trainer, not the last logged training step)
- Peak reserved memory: 57.799 GB
- LoRA-attributed reserved memory: 40.145 GB
- Peak memory share: 72.93%
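A back-of-envelope check on the memory figures (this assumes the peak memory share is peak reserved memory divided by the device's total memory as seen by torch, which is Unsloth's usual reporting convention):

```python
# Figures from the model card
peak_reserved = 57.799   # GB
lora_reserved = 40.145   # GB
peak_share = 0.7293      # 72.93%

# Implied total device memory — consistent with the ~79 GB that
# torch.cuda reports for an A100-SXM4-80GB.
implied_total = peak_reserved / peak_share  # ~79.25 GB

# Fraction of the peak that the LoRA training itself accounted for.
lora_fraction = lora_reserved / peak_reserved  # ~0.695

print(round(implied_total, 2), round(lora_fraction, 3))
```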
From the provided loss screenshot, eval loss drops from about 1.47 early in training to about 1.40 around the middle of the run, then rises slightly to about 1.42 near step 200. Train loss falls from about 3.1 at the start to about 1.1 by the end.
The published repository is the final merged checkpoint from the run, not an explicitly selected best-eval checkpoint.
Usage
Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "AyoubChLin/Qwen3.5-9B-saudi-dialect"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."},
    {"role": "user", "content": "كيف حالك اليوم؟"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=200,
    do_sample=True,  # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
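The final slice, `outputs[0][input_ids.shape[-1]:]`, is there because `generate` returns the prompt tokens followed by the completion. A toy illustration of the pattern with hypothetical token-id lists:

```python
# Hypothetical token ids: generate() echoes the prompt, then appends new tokens.
prompt_ids = [101, 42, 7, 9]
generated = prompt_ids + [55, 18, 2]   # what generate() returns for one sequence

# Keep only the newly generated part, exactly as in the snippet above.
new_tokens = generated[len(prompt_ids):]
print(new_tokens)   # [55, 18, 2]
```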
Install Unsloth
This is a Jupyter/Colab notebook cell; `%%capture` and the `!pip` lines are IPython magics and will not run as a plain Python script:

```python
%%capture
import re, torch

# Pick an xformers build matching the installed torch version.
v = re.match(r"[\d]{1,}\.[\d]{1,}", str(torch.__version__)).group(0)
xformers = "xformers==" + {
    "2.10": "0.0.34",
    "2.9": "0.0.33.post1",
    "2.8": "0.0.32.post2",
}.get(v, "0.0.34")

!pip install sentencepiece protobuf "datasets>=2.18.0" "huggingface_hub>=0.34.0" hf_transfer wandb
!pip install --no-deps unsloth_zoo bitsandbytes accelerate {xformers} peft trl triton unsloth
!pip install -q "transformers>=5.0.0"
!pip install -q --no-deps "trl>=0.15.0"
```
Unsloth
This repo was pushed as merged_16bit, so load it with load_in_4bit=False.
```python
from unsloth import FastLanguageModel

repo_id = "AyoubChLin/Qwen3.5-9B-saudi-dialect"
max_seq_length = 4096

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=repo_id,
    max_seq_length=max_seq_length,
    load_in_4bit=False,  # this repo was pushed as merged_16bit
)
FastLanguageModel.for_inference(model)

messages = [
    {
        "role": "system",
        "content": [
            {"type": "text", "text": "أنت مساعد مفيد يتحدث باللهجة السعودية العامية."}
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "كيف حالك اليوم؟"}
        ],
    },
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids=input_ids,
    max_new_tokens=200,
    use_cache=True,
    do_sample=True,  # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```
Related artifacts
- LoRA adapters: AyoubChLin/Qwen3.5-9B-saudi-dialect-lora
Limitations
- No external benchmark evaluation is included beyond train/eval loss on the source dataset.
- Saudi dialect coverage is likely uneven across regions, phrasing styles, and topics.
- The model can still hallucinate, over-generalize, or drift toward more formal Arabic depending on the prompt.