---
language:
- ar
license: apache-2.0
base_model: AISA-Framework/AISA-AR-FunctionCall-FT
tags:
- function-calling
- arabic
- tool-use
- agentic
- gemma
- reasoning
- lora
- think
datasets:
- AISA-Framework/AISA-AR-FunctionCall
pipeline_tag: text-generation
library_name: transformers
---

# AISA-AR-FunctionCall-Think

**Reasoning-Augmented Arabic Structured Tool Calling**

`AISA-AR-FunctionCall-Think` is a reasoning-enhanced variant of the Arabic function-calling model introduced in the **AISA-AR-FunctionCall** framework. The model generates an intermediate reasoning trace before invoking a tool, enabling transparent decision-making for Arabic agentic systems.

This model extends [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) by introducing explicit reasoning supervision using `<think>` blocks prior to tool execution.

---

## Model Overview

| Field | Value |
|---|---|
| **Model name** | AISA-AR-FunctionCall-Think |
| **Base model** | AISA-AR-FunctionCall-FT |
| **Architecture** | Gemma 3 (FunctionGemma 270M) |
| **Training method** | LoRA reasoning fine-tuning |
| **Primary task** | Arabic reasoning-aware function calling |

The model produces outputs in the following pattern:

```
<think>
reasoning about tool selection
</think>
call:tool_name{arguments}
```

This allows the system to expose the reasoning behind tool selection.

---

## Key Capabilities

- Reasoning-aware tool selection
- Explicit decision traces for tool invocation
- Improved argument extraction consistency
- Interpretable structured execution

**Supported domains:**

| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |

**Supported Arabic dialect groups:**

- Modern Standard Arabic (MSA)
- Gulf
- Egyptian
- Levantine
- Maghrebi

---

## Training Dataset

Training uses a subset of the [AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) dataset with reasoning annotations.

| Property | Value |
|---|---|
| Dataset size | ~12k reasoning-augmented samples |
| Dialect coverage | 5 Arabic dialects |
| Domains | 8 real-world domains |
| Tools | 27 structured tools |

---

## Training Methodology

The reasoning model is trained by augmenting assistant outputs with explicit reasoning segments.

**Training format:**

```
<think>
tool selection reasoning
</think>
call:tool{arguments}
```

Reasoning-first generation is enforced at inference time by priming the model to begin its output with `<think>`.

**Training configuration:**

| Parameter | Value |
|---|---|
| Training type | LoRA fine-tuning |
| LoRA rank | 64 |
| Alpha | 64 |
| Dropout | 0.05 |
| Trainable parameters | ~5.36% |
| Epochs | 3 |
| Learning rate | 3e-6 |
| Effective batch size | 32 |
| Optimizer | 8-bit AdamW |
| Scheduler | Cosine |

Additional training signals include **negative tool examples** to reduce hallucinated tool calls when no tool invocation is required.

---

## Evaluation Results

Evaluation is performed on a strict reasoning evaluation subset.

### Strict Evaluation (n = 240)

| Metric | Score |
|---|---|
| Tool Call Rate | 0.992 |
| Think-Before-Call Rate | **1.000** |
| Function Name Accuracy | 0.992 |
| Argument F1 | **1.000** |
| Decision Accuracy | 0.992 |
| Hallucination Rate | **0.000** |

These results indicate that the model consistently performs reasoning before tool invocation and achieves near-perfect structured alignment within the evaluated subset.

### Important Note on Format Validation

Standard function-call validators may classify reasoning outputs as **parse failures** because `<think>` tokens appear before the function call marker. This does **not** indicate structural instability; it reflects a difference in serialization format. When reasoning segments are permitted, tool invocation correctness remains near-perfect.

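
A validator that tolerates a leading reasoning segment can be sketched as follows. This is a minimal illustration, not the evaluation code used for the results above. It assumes the reasoning is delimited by `<think>` and `</think>` tags and that calls follow the `call:tool_name{key:value,...}` shape shown in this card; the names `parse_think_output` and `ParsedOutput` are hypothetical, and argument values are kept as raw strings.

```python
import re
from typing import NamedTuple, Optional

# Assumption: reasoning is wrapped in <think>...</think>; adjust the pattern
# if the tokenizer uses different markers.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# Call syntax inferred from the card's example: call:tool_name{key:value,key:value}
CALL_RE = re.compile(r"call:(\w+)\{(.*?)\}", re.DOTALL)


class ParsedOutput(NamedTuple):
    reasoning: Optional[str]
    tool: Optional[str]
    arguments: dict


def parse_think_output(text: str) -> ParsedOutput:
    """Split a model output into reasoning, tool name, and raw string arguments."""
    think = THINK_RE.search(text)
    reasoning = think.group(1).strip() if think else None
    # Look for the call only after the reasoning block, so a tool name
    # mentioned inside the reasoning is not mistaken for the call itself.
    rest = text[think.end():] if think else text
    call = CALL_RE.search(rest)
    if call is None:
        return ParsedOutput(reasoning, None, {})
    arguments = {}
    # Naive comma split: values that themselves contain commas are not handled.
    for pair in call.group(2).split(","):
        if ":" in pair:
            key, value = pair.split(":", 1)
            arguments[key.strip()] = value.strip()
    return ParsedOutput(reasoning, call.group(1), arguments)


if __name__ == "__main__":
    sample = "<think>reasoning about tool selection</think>\ncall:get_weather{city:الرياض,days:1}"
    print(parse_think_output(sample))
```

Searching for the call only after the closing tag keeps a tool name quoted inside the reasoning from being counted as an invocation, which matters when measuring metrics such as tool call rate on outputs without a call.
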

---

## Example Usage

**User query** (English: "What is the weather in Riyadh today?"):

```
ما حالة الطقس في الرياض اليوم؟
```

**Model output:**

```
<think>
المستخدم يريد معرفة حالة الطقس في مدينة الرياض، لذا يجب استخدام أداة get_weather.
</think>
call:get_weather{city:الرياض,days:1}
```

In English, the reasoning reads: "The user wants to know the weather in the city of Riyadh, so the get_weather tool should be used."

---

## Intended Use

This model is intended for:

- Research on reasoning-aware tool calling
- Interpretable agent systems
- Arabic reasoning supervision experiments
- Debugging tool selection behavior

### Production Recommendation

This model is an **exploratory research variant**. For production deployment, we recommend using [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT).

---

## Related Resources

| Resource | Link |
|---|---|
| Dataset | [AISA-Framework/AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) |
| Production model | [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) |
| Model collection | [AISA Arabic FunctionCall](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models) |

---

## Paper

**From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning**

*AISA Framework*

---

## AISA Framework

This model is part of the **AISA** (Agentic AI Systems Architecture) initiative for building reliable multilingual AI agents.

---

## License

[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)