---
language:
- ar
- en
tags:
- code
- arabic
- gguf
- code-explanation
- text-generation
license: apache-2.0
---

# 🐪 AraCode-7B-GGUF

**The first open-source Arabic-specialized code explanation and generation model.**

AraCode-7B understands, explains, and generates code in Arabic — a capability no existing model provides with such precision. Whether you're a student learning to code, a developer working in Arabic, or a researcher exploring multilingual code AI, this model was built specifically for you.

---

## 🌟 What makes AraCode-7B different?

Existing code models (CodeLlama, StarCoder, DeepSeek-Coder) generate excellent code but communicate effectively only in English. Conversely, general Arabic LLMs (Jais, ALLaM, Falcon-Arabic) handle Arabic beautifully but were never optimized specifically for coding tasks.

**AraCode-7B bridges this gap.** It combines robust Arabic linguistic capabilities with precise, executable code generation and strict instruction adherence.

---

## 📊 Comprehensive Benchmarks

We evaluated **AraCode-7B** using both custom coding benchmarks and standardized frameworks (IFEval, AraGen) to compare its performance against the latest state-of-the-art Arabic and multilingual models.

### 1. Code Generation & Understanding (Zero-Shot)
Tested on a custom Arabic benchmark measuring raw coding capability, algorithmic logic, and debugging.

| Model | Code Gen (%) | Explain (%) | Debug (%) | Translate NL->Code (%) | Total Score |
|:---|:---:|:---:|:---:|:---:|:---:|
| **AraCode-7B (Ours)** | **90.0%** | **92.5%** | **100.0%** | **94.0%** | **94.12%** |
| ALLaM-7B-Instruct | 45.0% | 86.2% | 100.0% | 90.0% | 80.30% |

> **Key Takeaway:** AraCode-7B achieves a massive **90% in executable Code Generation**. Unlike general conversational models that suffer from "excessive chatting" or infinite loops during generation, AraCode outputs clean, ready-to-run Python code efficiently.

### 2. Instruction Following (IFEval - Arabic)
Evaluated on strict instruction adherence (e.g., "output only code", "start with a specific word"). *Competitor scores are based on published strict 0-shot IFEval (ar) benchmarks.*

| Model | IFEval (Arabic) (%) |
|:---|:---:|
| **AraCode-7B (Ours - Local Eval)** | **80.00%** |
| Jais-2-8B | 37.92% |
| Qwen2.5-7B-Instruct | 33.21% |
| ALLaM-7B-Instruct-preview | 19.40% |
| Llama-3.1-8B-Instruct | 10.87% |

> **Key Takeaway:** AraCode-7B excels at instruction following. For developers, this means the model respects formatting constraints (like returning raw code without Markdown blocks) far better than general-purpose LLMs.
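This formatting discipline matters in practice: tooling that pipes a model's reply straight into a file or interpreter breaks when the reply is wrapped in Markdown fences. As an illustration only (a hypothetical helper, not part of this repo), a fallback stripper for less compliant models might look like:

```python
import re

def strip_markdown_fences(text: str) -> str:
    """Extract raw code from a model reply that may wrap it in ``` fences.

    If a fenced block is present, return its contents; otherwise return
    the text unchanged. Useful as a safety net for models that ignore
    "output only code" instructions.
    """
    match = re.search(r"```[\w+-]*\n(.*?)```", text, flags=re.DOTALL)
    return match.group(1).rstrip() if match else text.strip()

# A compliant reply is already raw code and passes through untouched:
print(strip_markdown_fences("print('hi')"))
# A fenced reply still yields usable code:
print(strip_markdown_fences("```python\nprint('hi')\n```"))
```

A model that follows the "output only code" constraint natively makes this post-processing step unnecessary.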

### 3. Cultural Alignment & Safety (AraGen 3C3H Framework)
Evaluated on Cultural awareness, Helpfulness, Harmlessness, Honesty, and Humility. *Competitor scores are based on published AraGen 12-24 benchmarks.*

| Model | AraGen 3C3H Average (%) |
|:---|:---:|
| Jais-2-8B | 67.20% |
| Qwen2.5-7B-Instruct | 53.20% |
| **AraCode-7B (Ours - Local Eval)** | **50.00%** |
| Llama-3.1-8B-Instruct | 40.65% |

> **Key Takeaway:** AraCode-7B maintains a healthy balance (50%) in safety and cultural alignment. As a domain-specific model optimized for logic and programming, it successfully avoids the "alignment tax"—ensuring that strict conversational guardrails do not degrade its primary function as a coding assistant.

---

## 🚀 Quickstart

You can easily run this model locally using popular GGUF tools.

**Using llama.cpp:**
```bash
llama-cli -hf rahimdzx/AraCode-7B-GGUF --jinja
```

**Using Ollama:**
```bash
ollama run hf.co/rahimdzx/AraCode-7B-GGUF
```

**Using llama-cpp-python:**
```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="rahimdzx/AraCode-7B-GGUF",
    filename="aracode-7b.Q4_K_M.gguf",
    n_gpu_layers=-1,
    n_ctx=2048
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "اكتب دالة بايثون للبحث الثنائي (Binary Search)."}
    ]
)
print(response["choices"][0]["message"]["content"])
```

---

## 💻 Example Usage

**Input:**
> اشرح الكود التالي بالعربية: *(Explain the following code in Arabic:)*
> ```python
> def binary_search(arr, target):
>     lo, hi = 0, len(arr) - 1
>     while lo <= hi:
>         mid = (lo + hi) // 2
>         if arr[mid] == target:
>             return mid
>         elif arr[mid] < target:
>             lo = mid + 1
>         else:
>             hi = mid - 1
>     return -1
> ```

**Output:**
> ### شرح الكود باللغة العربية
> 
> #### 1. **تعريف الدالة**:
> الدالة المعرفة في الكود تُدعى `binary_search`، والتي تستخدم للبحث عن عنصر معين في مصفوفة مرتبة.
> 
> #### 2. **المدخلات**:
> - `arr`: هي المصفوفة المرتبة التي نبحث فيها عن العنصر.
> - `target`: هو العنصر الذي نريد العثور عليه في المصفوفة.
> 
> #### 3. **المخرجات**:
> الدالة تعيد الفهرس (index) للعنصر إذا تم العثور عليه، أو `-1` إذا لم يتم العثور عليه.
> 
> #### 4. **الخطوات**:
> - **تثبيت الحدود**: نبدأ بتعريف متغيرين `lo` و `hi`، حيث `lo` يمثل بداية المصفوفة (0) و `hi` يمثل النهاية (طول المصفوفة - 1).
> - **تكرار البحث**: نستخدم حلقة `while` للتحقق من أن `lo` أقل أو يساوي `hi`.
> - **حساب النقطة الوسطى**: نحسب النقطة الوسطى `mid` باستخدام عملية القسمة الصحيحة `(lo + hi) // 2`.
> - **مقارنة العنصر**: 
>   - إذا كان `arr[mid] == target`، نعيد الفهرس `mid`.
>   - إذا كان أقل، نبحث في النصف الأيمن (`lo = mid + 1`).
>   - إذا كان أكبر، نبحث في النصف الأيسر (`hi = mid - 1`).
> 
> #### مثال عملي:
> إذا كان لدينا المصفوفة `arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]` والهدف `target = 5`، الدالة ستعيد الفهرس `4`.
> 
> #### ملخص:
> تستخدم الدالة تقنية البحث الثنائي بكفاءة عالية وبتعقيد زمني O(log n)، مما يجعلها ممتازة للمصفوفات الكبيرة.

**GitHub:** https://github.com/Rahimdzx/AraCode-7B

## 📄 License
This model is released under the **Apache 2.0** license.