File size: 5,957 Bytes
95c1548 cb123ca 95c1548 cb123ca 3945705 cb123ca 0945968 cb123ca 48795dc cb123ca 95c1548 cb123ca 95c1548 cb123ca 95c1548 cb123ca 95c1548 cb123ca | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 | ---
language:
- ar
license: apache-2.0
base_model: unsloth/functiongemma-270m-it
tags:
- function-calling
- arabic
- tool-use
- agentic
- gemma
- fine-tuned
datasets:
- AISA-Framework/AISA-AR-FunctionCall
pipeline_tag: text-generation
library_name: transformers
---
# AISA-AR-FunctionCall-FT
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/vnL90Tybn1528x21dMNsd.png" width="700"/>
</p>
**Reliable Arabic Structured Tool Calling via Data-Centric Fine-Tuning**
`AISA-AR-FunctionCall-FT` is a fully fine-tuned Arabic function-calling model built on top of [FunctionGemma (Gemma 3 270M)](https://huggingface.co/unsloth/functiongemma-270m-it) and optimized for structured tool invocation in Arabic agentic systems.
The model converts natural Arabic requests into structured executable API calls, enabling reliable integration between language models and external tools.
> This model is part of the **AISA** (Agentic AI Systems Architecture) initiative.
## Try the Model in Google Colab
You can run a full inference example using the notebook below.
[](https://colab.research.google.com/drive/1zTBeIEvb66AO6GVWZCkY-8PyYM01KQyO?usp=sharing)
The notebook demonstrates:
- Loading the model
- Defining tool schemas
- Generating structured tool calls
- Parsing function call outputs
---
## Model Overview
| Field | Value |
|---|---|
| **Model name** | AISA-AR-FunctionCall-FT |
| **Base model** | unsloth/functiongemma-270m-it |
| **Architecture** | Gemma 3 (270M parameters) |
| **Fine-tuning type** | Full-parameter supervised fine-tuning |
| **Primary task** | Arabic function calling / tool invocation |
The model is designed to translate Arabic natural language requests into structured tool calls following the FunctionGemma tool-calling format.
---
## Key Capabilities
- Arabic natural language → structured API calls
- Multi-dialect Arabic understanding
- Tool selection and argument extraction
- Structured execution environments
**Supported domains:**
| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |
---
## Dataset
The model is trained on **AISA-AR-FunctionCall** — a production-ready Arabic function-calling dataset built through a rigorous data-centric pipeline:
- Dataset auditing
- Schema normalization
- Enum correction
- Tool pruning
- Prompt restructuring
- Tool sampling
**Dataset splits:**
| Split | Samples |
|---|---|
| Train | 41,104 |
| Validation | 4,568 |
| Test | 5,079 |
**Dataset includes:**
- 5 Arabic dialects
- 8 real-world domains
- 27 tool schemas
- Structured tool-call annotations
Dataset: [AISA-Framework/AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall)
---
## Training Methodology
The model was trained using a **data-centric fine-tuning pipeline** designed to stabilize structured execution.
**Key pipeline steps:**
1. Structural dataset auditing
2. Enum constraint repair
3. Tool schema normalization
4. Tool pruning (36 → 27 tools)
5. Tool sampling to prevent prompt truncation
6. FunctionGemma-compatible chat serialization
7. Completion-only supervised fine-tuning
**Training configuration:**
| Parameter | Value |
|---|---|
| Model size | 270M |
| Training type | Full fine-tuning |
| Epochs | 2 |
| Effective batch size | 32 |
| Learning rate | 2e-5 |
| Optimizer | 8-bit AdamW |
| Scheduler | Cosine |
| Precision | BF16 |
| Gradient checkpointing | Enabled |
---
## Evaluation Results
Evaluation was performed on a held-out test set of **5,079 samples**.
### Clean Positive Evaluation (n = 2,873)
| Metric | Baseline | AISA-AR-FunctionCall-FT |
|---|---|---|
| Function Name Accuracy | 0.0804 | **0.6547** |
| Full Tool-Call Match | 0.0056 | **0.3362** |
| Argument Key F1 | 0.0600 | **0.5728** |
| Argument Exact Match | 0.0422 | **0.6377** |
| Parse Failure Rate | 0.8726 | **0.0084** |
| Format Validity | 0.1274 | **0.9916** |
| Hallucination Rate | 0.0003 | 0.0226 |
> **Key improvement:** Parse failure reduced from **87% → <1%**
### Dialect Performance
| Dialect | Function Accuracy |
|---|---|
| MSA | 0.761 |
| Gulf | 0.697 |
| Egyptian | 0.683 |
| Levantine | 0.694 |
| Maghrebi | 0.616 |
Fine-tuning significantly reduces dialect disparity compared to the baseline model.
---
## Known Limitations
Remaining errors are primarily **semantic**, including:
- Tool selection ambiguity
- Argument mismatches
- Domain overlap (e.g., weather vs. air quality)
Structured formatting errors are largely eliminated.
---
## Example Usage
**Prompt:**
```
ما حالة الطقس في الرياض اليوم؟
```
**Model output:**
```
<start_function_call>
call:get_weather{
city:<escape>الرياض<escape>,
days:1
}
<end_function_call>
```
The structured call can then be executed by the application runtime.
---
## Intended Use
This model is designed for:
- Arabic AI assistants
- Tool-based agents
- Structured API orchestration
- Arabic enterprise automation
- Research on multilingual tool calling
### Out-of-Scope Uses
This model is **not** designed for:
- General chatbots or open-ended conversation
- Sensitive decision-making systems
- Safety-critical deployments without additional validation
---
## Related Models
| Model | Description |
|---|---|
| [AISA-AR-FunctionCall-Think](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-Think) | Reasoning-augmented tool-calling model |
---
## AISA Framework
This model is part of the AISA initiative for building reliable agentic AI systems.
Model collection: [AISA-Framework/aisa-arabic-functioncall-datasets-and-models](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models)
---
## License
[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |