Omartificial-Intelligence-Space committed 49b980c (verified, parent: 53fc7ed): Create README.md
---
language:
- ar
license: apache-2.0
base_model: AISA-Framework/AISA-AR-FunctionCall-FT
tags:
- function-calling
- arabic
- tool-use
- agentic
- gemma
- reasoning
- lora
- think
datasets:
- AISA-Framework/AISA-AR-FunctionCall
pipeline_tag: text-generation
library_name: transformers
---

# AISA-AR-FunctionCall-Think

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/21Mxl67VW-RQFiXTnvheT.png" width="700"/>
</p>

**Reasoning-Augmented Arabic Structured Tool Calling**

`AISA-AR-FunctionCall-Think` is a reasoning-enhanced variant of the Arabic function-calling model introduced in the **AISA-AR-FunctionCall** framework. The model generates an intermediate reasoning trace before invoking a tool, enabling transparent decision-making for Arabic agentic systems.

This model extends [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) by introducing explicit reasoning supervision using `<think>` blocks prior to tool execution.

---

## Model Overview

| Field | Value |
|---|---|
| **Model name** | AISA-AR-FunctionCall-Think |
| **Base model** | AISA-AR-FunctionCall-FT |
| **Architecture** | Gemma 3 (FunctionGemma 270M) |
| **Training method** | LoRA reasoning fine-tuning |
| **Primary task** | Arabic reasoning-aware function calling |

The model produces outputs in the following pattern:

```
<think>
reasoning about tool selection
</think>
<start_function_call>
call:tool_name{arguments}
</end_function_call>
```

This allows the system to expose the reasoning behind tool selection.
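Downstream code needs to separate the two segments of this pattern. A minimal Python sketch using the standard `re` module; `split_reasoning_output` is an illustrative helper, not part of any shipped library:

```python
import re

def split_reasoning_output(text):
    """Split a model response into (reasoning, call); either part is None if absent."""
    think = re.search(r"<think>\s*(.*?)\s*</think>", text, re.DOTALL)
    call = re.search(r"<start_function_call>\s*(.*?)\s*</end_function_call>",
                     text, re.DOTALL)
    return (think.group(1) if think else None,
            call.group(1) if call else None)

# Split the example pattern into its two segments.
reasoning, call = split_reasoning_output(
    "<think>\nreasoning about tool selection\n</think>\n"
    "<start_function_call>\ncall:tool_name{arguments}\n</end_function_call>"
)
```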

---

## Key Capabilities

- Reasoning-aware tool selection
- Explicit decision traces for tool invocation
- Improved argument extraction consistency
- Interpretable structured execution

**Supported domains:**

| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |

**Supported Arabic dialect groups:**

- Modern Standard Arabic (MSA)
- Gulf
- Egyptian
- Levantine
- Maghrebi

---

## Training Dataset

Training uses a subset of the [AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) dataset with reasoning annotations.

| Property | Value |
|---|---|
| Dataset size | ~12k reasoning-augmented samples |
| Dialect coverage | 5 Arabic dialects |
| Domains | 8 real-world domains |
| Tools | 27 structured tools |

---

## Training Methodology

The reasoning model is trained by augmenting assistant outputs with explicit reasoning segments.

**Training format:**

```
<think>
tool selection reasoning
</think>
<start_function_call>
call:tool{arguments}
</end_function_call>
```

At inference time, reasoning is elicited by priming the model to begin its generation with `<think>`.
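One way to implement this priming is to append an opening `<think>` tag to the templated prompt, so the first generated tokens must continue the reasoning block. A minimal sketch; the Gemma-style turn markers are shown for illustration only and are an assumption, not taken from this model's chat template:

```python
# Gemma-style assistant turn marker (illustrative assumption).
MODEL_TURN = "<start_of_turn>model\n"

def prime_with_think(templated_prompt):
    """Append an opening <think> tag so generation starts inside the
    reasoning block rather than with an immediate tool call."""
    return templated_prompt + "<think>\n"

# Build a primed prompt for the Arabic weather query used later in this card.
primed = prime_with_think(
    "<start_of_turn>user\nما حالة الطقس في الرياض اليوم؟<end_of_turn>\n" + MODEL_TURN
)
```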

**Training configuration:**

| Parameter | Value |
|---|---|
| Training type | LoRA fine-tuning |
| LoRA rank | 64 |
| Alpha | 64 |
| Dropout | 0.05 |
| Trainable parameters | ~5.36% |
| Epochs | 3 |
| Learning rate | 3e-6 |
| Effective batch size | 32 |
| Optimizer | 8-bit AdamW |
| Scheduler | Cosine |

Additional training signals include **negative tool examples** to reduce hallucinated tool calls when no tool invocation is required.

---

## Evaluation Results

Evaluation is performed on a strict reasoning evaluation subset.

### Strict Evaluation (n = 240)

| Metric | Score |
|---|---|
| Tool Call Rate | 0.992 |
| Think-Before-Call Rate | **1.000** |
| Function Name Accuracy | 0.992 |
| Argument F1 | **1.000** |
| Decision Accuracy | 0.992 |
| Hallucination Rate | **0.000** |

These results indicate that the model consistently performs reasoning before tool invocation and achieves near-perfect structured alignment within the evaluated subset.
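The two format-level metrics can be sketched under assumed definitions (not taken from the actual evaluation code): tool-call rate as the fraction of outputs containing a call marker, and think-before-call rate as the fraction of those calls where a closed `<think>` block precedes the call marker:

```python
def strict_format_metrics(outputs):
    """Compute two format-level metrics over raw model output strings.

    Assumed definitions:
    - tool_call_rate: fraction of outputs containing a call marker
    - think_before_call_rate: fraction of those calls where </think>
      closes before the call marker
    """
    calls = [o for o in outputs if "<start_function_call>" in o]
    think_first = [o for o in calls
                   if "</think>" in o
                   and o.index("</think>") < o.index("<start_function_call>")]
    return {
        "tool_call_rate": len(calls) / len(outputs) if outputs else 0.0,
        "think_before_call_rate": len(think_first) / len(calls) if calls else 0.0,
    }

m = strict_format_metrics([
    "<think>a</think>\n<start_function_call>call:t{}</end_function_call>",
    "لا حاجة لأداة هنا.",  # Arabic for "no tool is needed here": no call emitted
])
```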

### Important Note on Format Validation

Standard function-call validators may classify reasoning outputs as **parse failures** because `<think>` tokens appear before the function call marker.

This does **not** indicate structural instability; it reflects a difference in serialization format. When reasoning segments are permitted, tool invocation correctness remains near-perfect.
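A lenient validator can therefore strip the reasoning block before delegating to a standard function-call parser. A minimal sketch:

```python
import re

def strip_reasoning(text):
    """Drop the <think>...</think> block (and trailing whitespace) so a
    standard function-call validator sees only the serialized call."""
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)

# The remaining text begins directly at the function-call marker.
bare = strip_reasoning(
    "<think>\npick a tool\n</think>\n"
    "<start_function_call>\ncall:tool{arguments}\n</end_function_call>"
)
```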

---

## Example Usage

**User query** (Arabic for "What is the weather in Riyadh today?"):

```
ما حالة الطقس في الرياض اليوم؟
```

**Model output** (the reasoning trace reads: "The user wants to know the weather in the city of Riyadh, so the get_weather tool should be used"):

```
<think>
المستخدم يريد معرفة حالة الطقس في مدينة الرياض، لذا يجب استخدام أداة get_weather.
</think>
<start_function_call>
call:get_weather{city:<escape>الرياض<escape>,days:1}
</end_function_call>
```
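The serialized call can be turned into a tool name and an argument dict with a small parser. The argument grammar assumed below (comma-separated `key:value` pairs, string values wrapped in `<escape>` markers, bare values as integers) is inferred from this single example and may not cover everything the model emits:

```python
import re

def parse_function_call(call):
    """Parse a call:name{...} string into (name, args), or None on mismatch.

    Grammar assumptions (inferred, not from an official spec): arguments
    are comma-separated key:value pairs; string values are wrapped in
    <escape> markers; bare numeric values are integers.
    """
    m = re.fullmatch(r"call:(\w+)\{(.*)\}", call.strip(), re.DOTALL)
    if m is None:
        return None
    name, body = m.group(1), m.group(2)
    args = {}
    for key, raw in re.findall(r"(\w+):(<escape>.*?<escape>|[^,]+)", body):
        if raw.startswith("<escape>"):
            args[key] = raw[len("<escape>"):-len("<escape>")]
        else:
            args[key] = int(raw) if raw.lstrip("-").isdigit() else raw
    return name, args

name, args = parse_function_call("call:get_weather{city:<escape>الرياض<escape>,days:1}")
```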

---

## Intended Use

This model is intended for:

- Research on reasoning-aware tool calling
- Interpretable agent systems
- Arabic reasoning supervision experiments
- Debugging tool selection behavior

### Production Recommendation

This model is an **exploratory research variant**. For production deployment, we recommend using [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT).

---

## Related Resources

| Resource | Link |
|---|---|
| Dataset | [AISA-Framework/AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) |
| Production model | [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) |
| Model collection | [AISA Arabic FunctionCall](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models) |

---

## Paper

**From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning**

*AISA Framework*

---

## AISA Framework

This model is part of the **AISA** (Agentic AI Systems Architecture) initiative for building reliable multilingual AI agents.

---

## License

[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)