Omartificial-Intelligence-Space's picture

Update README.md

48795dc verified 12 days ago

5.96 kB

	---
	language:
	- ar
	license: apache-2.0
	base_model: unsloth/functiongemma-270m-it
	tags:
	- function-calling
	- arabic
	- tool-use
	- agentic
	- gemma
	- fine-tuned
	datasets:
	- AISA-Framework/AISA-AR-FunctionCall
	pipeline_tag: text-generation
	library_name: transformers
	---


	# AISA-AR-FunctionCall-FT

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/vnL90Tybn1528x21dMNsd.png" width="700"/>
	</p>

	Reliable Arabic Structured Tool Calling via Data-Centric Fine-Tuning

	`AISA-AR-FunctionCall-FT` is a fully fine-tuned Arabic function-calling model built on top of [FunctionGemma (Gemma 3 270M)](https://huggingface.co/unsloth/functiongemma-270m-it) and optimized for structured tool invocation in Arabic agentic systems.

	The model converts natural Arabic requests into structured executable API calls, enabling reliable integration between language models and external tools.

	> This model is part of the AISA (Agentic AI Systems Architecture) initiative.


	## Try the Model in Google Colab

	You can run a full inference example using the notebook below.

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zTBeIEvb66AO6GVWZCkY-8PyYM01KQyO?usp=sharing)

	The notebook demonstrates:

	- Loading the model
	- Defining tool schemas
	- Generating structured tool calls
	- Parsing function call outputs

	---

	## Model Overview

	\| Field \| Value \|
	\|---\|---\|
	\| Model name \| AISA-AR-FunctionCall-FT \|
	\| Base model \| unsloth/functiongemma-270m-it \|
	\| Architecture \| Gemma 3 (270M parameters) \|
	\| Fine-tuning type \| Full-parameter supervised fine-tuning \|
	\| Primary task \| Arabic function calling / tool invocation \|

	The model is designed to translate Arabic natural language requests into structured tool calls following the FunctionGemma tool-calling format.

	---

	## Key Capabilities

	- Arabic natural language → structured API calls
	- Multi-dialect Arabic understanding
	- Tool selection and argument extraction
	- Structured execution environments

	Supported domains:

	\| Domain \|
	\|---\|
	\| Travel \|
	\| Utilities \|
	\| Islamic services \|
	\| Weather \|
	\| Healthcare \|
	\| Banking & finance \|
	\| E-commerce \|
	\| Government services \|

	---

	## Dataset

	The model is trained on AISA-AR-FunctionCall — a production-ready Arabic function-calling dataset built through a rigorous data-centric pipeline:

	- Dataset auditing
	- Schema normalization
	- Enum correction
	- Tool pruning
	- Prompt restructuring
	- Tool sampling

	Dataset splits:

	\| Split \| Samples \|
	\|---\|---\|
	\| Train \| 41,104 \|
	\| Validation \| 4,568 \|
	\| Test \| 5,079 \|

	Dataset includes:
	- 5 Arabic dialects
	- 8 real-world domains
	- 27 tool schemas
	- Structured tool-call annotations

	Dataset: [AISA-Framework/AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall)

	---

	## Training Methodology

	The model was trained using a data-centric fine-tuning pipeline designed to stabilize structured execution.

	Key pipeline steps:

	1. Structural dataset auditing
	2. Enum constraint repair
	3. Tool schema normalization
	4. Tool pruning (36 → 27 tools)
	5. Tool sampling to prevent prompt truncation
	6. FunctionGemma-compatible chat serialization
	7. Completion-only supervised fine-tuning

	Training configuration:

	\| Parameter \| Value \|
	\|---\|---\|
	\| Model size \| 270M \|
	\| Training type \| Full fine-tuning \|
	\| Epochs \| 2 \|
	\| Effective batch size \| 32 \|
	\| Learning rate \| 2e-5 \|
	\| Optimizer \| 8-bit AdamW \|
	\| Scheduler \| Cosine \|
	\| Precision \| BF16 \|
	\| Gradient checkpointing \| Enabled \|

	---

	## Evaluation Results

	Evaluation was performed on a held-out test set of 5,079 samples.

	### Clean Positive Evaluation (n = 2,873)

	\| Metric \| Baseline \| AISA-AR-FunctionCall-FT \|
	\|---\|---\|---\|
	\| Function Name Accuracy \| 0.0804 \| 0.6547 \|
	\| Full Tool-Call Match \| 0.0056 \| 0.3362 \|
	\| Argument Key F1 \| 0.0600 \| 0.5728 \|
	\| Argument Exact Match \| 0.0422 \| 0.6377 \|
	\| Parse Failure Rate \| 0.8726 \| 0.0084 \|
	\| Format Validity \| 0.1274 \| 0.9916 \|
	\| Hallucination Rate \| 0.0003 \| 0.0226 \|

	> Key improvement: Parse failure reduced from 87% → <1%

	### Dialect Performance

	\| Dialect \| Function Accuracy \|
	\|---\|---\|
	\| MSA \| 0.761 \|
	\| Gulf \| 0.697 \|
	\| Egyptian \| 0.683 \|
	\| Levantine \| 0.694 \|
	\| Maghrebi \| 0.616 \|

	Fine-tuning significantly reduces dialect disparity compared to the baseline model.

	---

	## Known Limitations

	Remaining errors are primarily semantic, including:

	- Tool selection ambiguity
	- Argument mismatches
	- Domain overlap (e.g., weather vs. air quality)

	Structured formatting errors are largely eliminated.

	---

	## Example Usage

	Prompt:

	```
	ما حالة الطقس في الرياض اليوم؟
	```

	Model output:

	```
	<start_function_call>
	call:get_weather{
	city:<escape>الرياض<escape>,
	days:1
	}
	<end_function_call>
	```

	The structured call can then be executed by the application runtime.

	---

	## Intended Use

	This model is designed for:

	- Arabic AI assistants
	- Tool-based agents
	- Structured API orchestration
	- Arabic enterprise automation
	- Research on multilingual tool calling

	### Out-of-Scope Uses

	This model is not designed for:

	- General chatbots or open-ended conversation
	- Sensitive decision-making systems
	- Safety-critical deployments without additional validation

	---

	## Related Models

	\| Model \| Description \|
	\|---\|---\|
	\| [AISA-AR-FunctionCall-Think](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-Think) \| Reasoning-augmented tool-calling model \|

	---

	## AISA Framework

	This model is part of the AISA initiative for building reliable agentic AI systems.

	Model collection: [AISA-Framework/aisa-arabic-functioncall-datasets-and-models](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models)

	---

	## License

	[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)