🇸🇾 Syrian_Qwen-3.5: The First Syrian Dialect LLM


🌟 Introduction

Welcome to the future of Levantine AI.

We are thrilled to introduce Syrian_Qwen-3.5, the first series of Large Language Models specifically fine-tuned to understand and generate the Syrian Arabic Dialect.

While most Arabic LLMs focus on Modern Standard Arabic (MSA/Fusha), they often fail to capture the nuance, warmth, and cultural specificity of local dialects. We changed that. By leveraging the powerful Qwen 3.5 architecture, we have fine-tuned this model not just to "speak Arabic," but to speak Syrian.

From the streets of Syria, this model understands the local idioms, slang, and cultural context that define Syrian communication.

🚀 Key Features

  • 🗣️ Native Dialect: Trained specifically on Syrian colloquial data, not just MSA.
  • 🧠 Smart & Small: Built on efficient Qwen small-model architecture for fast inference.
  • 🤝 Community First: Open weights with a strong commitment to the open-source ecosystem.

🛠️ How to Load & Use

Getting started with Syrian_Qwen-3.5 is seamless. You can load it using the standard transformers library.

Requirements

pip install transformers torch accelerate

Inference Code

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "theBOrg32/syrian_qwen_3.5_4B"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, 
    device_map="auto", 
    trust_remote_code=True,
    torch_dtype="auto"
)

# Prepare your Syrian Dialect prompt
# ("What do you think of the new restaurant in Damascus?")
prompt = "شو رأيك بالمطعم الجديد بدمشق؟"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(
    **inputs, 
    max_new_tokens=256, 
    temperature=0.5,
    do_sample=True
)

# Decode only the newly generated tokens (skip the echoed prompt)
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(response)
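If the checkpoint is instruction-tuned with a chat template (an assumption we have not verified here), prompts are usually wrapped as chat turns before tokenization rather than passed raw. A minimal sketch, with the hypothetical `to_chat` helper standing in for that wrapping step:

```python
# Hypothetical helper: wrap a Syrian-dialect prompt as a single user turn.
def to_chat(prompt: str) -> list[dict]:
    return [{"role": "user", "content": prompt}]

messages = to_chat("شو رأيك بالمطعم الجديد بدمشق؟")

# Then, with the tokenizer/model loaded as in the snippet above:
#   text = tokenizer.apply_chat_template(
#       messages, tokenize=False, add_generation_prompt=True
#   )
#   inputs = tokenizer(text, return_tensors="pt").to(model.device)

print(messages[0]["role"])  # user
```

If the tokenizer ships no chat template, the raw-prompt snippet above is the right path.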

💬 Examples & Benchmarks

We tested Syrian_Qwen-3.5 against general-purpose models. Notice how our model captures the tone and dialect while others revert to formal Arabic or English.

| Prompt (Syrian Dialect) | Syrian_Qwen-3.5 (Ours) | OpenAI | Gemini |
|---|---|---|---|
| "شو بتعرف عن المنطق؟" ("What do you know about logic?") | المنطق بيستخدم استدلالات مختلفة متل الاستدلال الاستنتاجي والاستدلال التحليلي. الاستدلال الاستنتاجي بيستخدم القواعد والمبادئ العامة ليوصل لنتائج محددة، بينما الاستدلال التحليلي بيقوم بتحليل الأفكار والمعلومات ليوصل لفهم أعمق. المنطق كمان بيستخدم استدلالات شكلية وغير شكلية. الاستدلالات الشكلية بتستخدم قواعد ثابتة ومحددة، بينما الاستدلالات غير الشكلية بتتغير حسب السياق والمعلومات المتاحة. المنطق كمان بيستخدم استدلالات منطقية بيزيد من فهمنا للعالم اللي حولنا… | المنطق هو فرع من Logic يدرس قواعد التفكير الصحيح والاستدلال، وكيف نميّز الحجج الصحيحة من الخاطئة. يستخدم في الفلسفة والرياضيات وعلوم الحاسوب لتحليل البراهين وبناء استنتاجات دقيقة. | المنطق هو العلم الذي يدرس القواعد العامة للتفكير الصحيح، ويهدف إلى حماية العقل من الوقوع في الخطأ من خلال وضع معايير تميز بين الحجج السليمة والباطلة. |

Note: full responses were trimmed for length. The takeaway is that even SOTA general-purpose models cannot hold the dialect and drift back to MSA.
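One quick way to eyeball "did the model stay in dialect?" is to count dialect markers versus MSA markers in a response. A rough, illustrative heuristic only; the tiny hand-picked word lists below are examples, not a validated lexicon:

```python
# Tiny example marker lists (illustrative, not a validated lexicon).
SYRIAN_MARKERS = {"شو", "متل", "كمان", "اللي", "هيك", "بدك"}
MSA_MARKERS = {"الذي", "التي", "ماذا", "سوف", "الذين"}


def dialect_score(text: str) -> float:
    """Fraction of marker hits that are Syrian (1.0 = all Syrian, 0.0 = all MSA)."""
    words = [w.strip("؟.,!،") for w in text.split()]
    syr = sum(w in SYRIAN_MARKERS for w in words)
    msa = sum(w in MSA_MARKERS for w in words)
    total = syr + msa
    return syr / total if total else 0.0


print(dialect_score("شو بتعرف عن المنطق؟"))          # 1.0
print(dialect_score("ما هو العلم الذي يدرس المنطق"))  # 0.0
```

On the table above, the two MSA replies score near zero while the dialect reply scores high; a real evaluation would of course use a larger lexicon or a trained dialect classifier.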


⚖️ License & Commercial Use

We are strong believers in the Open Source Community. To ensure this technology remains accessible and beneficial to everyone, we have chosen a Copyleft License.

📄 License: CC-BY-SA-4.0

This model is released under the Creative Commons Attribution-ShareAlike 4.0 International License.

🤝 Usage Guidelines

  1. ✅ Open Source Projects: You are free to use, fine-tune, and distribute this model in your projects, provided your project also remains open-source and references Syrian_Qwen-3.5.
  2. ✅ Commercial Use: Commercial usage is allowed under the terms of CC-BY-SA-4.0 (your derivative models must remain open).
  3. 🔒 Closed Source / Proprietary: If you wish to integrate this model (or a fine-tuned version) into a closed-source product without releasing your weights/code, you must obtain prior approval.

📧 For Closed-Source Licensing: Please contact us at info2@the-borg.ru to discuss agreements that respect our open-source mission.


🙏 Credits & Acknowledgments

This model would not be possible without the foundational work of the Qwen Team at Alibaba Cloud. We stand on the shoulders of giants.

  • Base Model: Qwen 3.5
  • Fine-Tuning & Alignment: The Borg Organization
  • Dataset: Curated Syrian Dialect Corpus

Citation

If you use Syrian_Qwen-3.5 in your research or project, please cite us:

@misc{syrian_qwen_2026,
  title={Syrian_Qwen-3.5: The First Syrian Dialect Large Language Model},
  author={The Borg Organization},
  year={2026},
  license={CC-BY-SA-4.0}
}

Built with ❤️ for the Syrian Community & The World
Preserving language, one token at a time.
