Instructions to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m", dtype="auto")
- llama-cpp-python
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m",
    filename="functiongemma_arabic_v5_Q4_K_M.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Use Docker
docker model run hf.co/AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
- SGLang
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m" \
  --host 0.0.0.0 \
  --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Ollama
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Ollama:
ollama run hf.co/AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
- Unsloth Studio
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m to start chatting
- Pi
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M"
        }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Docker Model Runner:
docker model run hf.co/AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
- Lemonade
How to use AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m:Q4_K_M
Run and chat with the model
lemonade run user.AISA-AR-FunctionCall-FT-q4_k_m-Q4_K_M
List all available models
lemonade list
AISA-AR-FunctionCall-FT (Quantized Version 4 bit)
Reliable Arabic Structured Tool Calling via Data-Centric Fine-Tuning
AISA-AR-FunctionCall-FT is a fully fine-tuned Arabic function-calling model built on top of FunctionGemma (Gemma 3 270M) and optimized for structured tool invocation in Arabic agentic systems.
The model converts natural Arabic requests into structured executable API calls, enabling reliable integration between language models and external tools.
This model is part of the AISA (Agentic AI Systems Architecture) initiative.
Try the Model in Google Colab
You can run a full inference example using the notebook below.
The notebook demonstrates:
- Loading the model
- Defining tool schemas
- Generating structured tool calls
- Parsing function call outputs
Model Overview
| Field | Value |
|---|---|
| Model name | AISA-AR-FunctionCall-FT |
| Base model | unsloth/functiongemma-270m-it |
| Architecture | Gemma 3 (270M parameters) |
| Fine-tuning type | Full-parameter supervised fine-tuning |
| Primary task | Arabic function calling / tool invocation |
The model is designed to translate Arabic natural language requests into structured tool calls following the FunctionGemma tool-calling format.
Key Capabilities
- Arabic natural language → structured API calls
- Multi-dialect Arabic understanding
- Tool selection and argument extraction
- Output designed for structured execution environments
Supported domains:
| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |
Dataset
The model is trained on AISA-AR-FunctionCall, a production-ready Arabic function-calling dataset built through a data-centric pipeline with the following steps:
- Dataset auditing
- Schema normalization
- Enum correction
- Tool pruning
- Prompt restructuring
- Tool sampling
Dataset splits:
| Split | Samples |
|---|---|
| Train | 41,104 |
| Validation | 4,568 |
| Test | 5,079 |
Dataset includes:
- 5 Arabic dialects
- 8 real-world domains
- 27 tool schemas
- Structured tool-call annotations
Dataset: AISA-Framework/AISA-AR-FunctionCall
Training Methodology
The model was trained using a data-centric fine-tuning pipeline designed to stabilize structured execution.
Key pipeline steps:
- Structural dataset auditing
- Enum constraint repair
- Tool schema normalization
- Tool pruning (36 → 27 tools)
- Tool sampling to prevent prompt truncation
- FunctionGemma-compatible chat serialization
- Completion-only supervised fine-tuning
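The tool sampling step can be illustrated with a short sketch. The idea is to show the model only a subset of the tool schemas per example so that the prompt stays within the context window, while always keeping the correct tool in the subset. The function name `sample_tools`, the `max_tools=8` limit, and the placeholder tool names are assumptions for illustration; the source does not state how many tools are sampled or how.

```python
import random


def sample_tools(all_tools, gold_tool, max_tools=8, seed=None):
    """Return up to max_tools tool names that always include gold_tool.

    max_tools is an assumed limit, not a value taken from the model card.
    """
    if max_tools < 1:
        raise ValueError("max_tools must be at least 1")
    rng = random.Random(seed)
    # Every tool other than the correct one is a potential distractor.
    distractors = [tool for tool in all_tools if tool != gold_tool]
    k = min(max_tools - 1, len(distractors))
    chosen = rng.sample(distractors, k) + [gold_tool]
    rng.shuffle(chosen)  # avoid the gold tool always appearing last
    return chosen


# Placeholder names standing in for the 27 tool schemas.
tools = [f"tool_{i}" for i in range(27)]
subset = sample_tools(tools, gold_tool="tool_3", max_tools=8, seed=0)
```

Shuffling the subset keeps the position of the correct tool from becoming a shortcut the model could learn.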
Training configuration:
| Parameter | Value |
|---|---|
| Model size | 270M |
| Training type | Full fine-tuning |
| Epochs | 2 |
| Effective batch size | 32 |
| Learning rate | 2e-5 |
| Optimizer | 8-bit AdamW |
| Scheduler | Cosine |
| Precision | BF16 |
| Gradient checkpointing | Enabled |
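As a rough worked example, the configuration above implies the following optimizer step counts and learning-rate curve. This is a sketch under stated assumptions: one training sample per sequence (no packing), the last partial batch is kept, and the cosine schedule decays to zero with no warmup. None of these details are given in the source, and the split of the effective batch size into per-device batch size and gradient accumulation steps is not specified either.

```python
import math

TRAIN_SAMPLES = 41_104   # train split size from the dataset table
EFFECTIVE_BATCH = 32
EPOCHS = 2
PEAK_LR = 2e-5

# Optimizer steps, assuming the final partial batch is not dropped.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / EFFECTIVE_BATCH)
total_steps = steps_per_epoch * EPOCHS


def cosine_lr(step, total=total_steps, peak=PEAK_LR):
    """Cosine decay from peak to 0, with no warmup (an assumption)."""
    progress = min(step, total) / total
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps)
print(cosine_lr(0), cosine_lr(total_steps // 2), cosine_lr(total_steps))
```

Under these assumptions the run takes 1,285 steps per epoch and 2,570 steps in total, with the learning rate at half its peak value midway through training.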
Evaluation Results
Evaluation was performed on a held-out test set of 5,079 samples.
Clean Positive Evaluation (n = 2,873)
| Metric | Baseline | AISA-AR-FunctionCall-FT |
|---|---|---|
| Function Name Accuracy | 0.0804 | 0.6547 |
| Full Tool-Call Match | 0.0056 | 0.3362 |
| Argument Key F1 | 0.0600 | 0.5728 |
| Argument Exact Match | 0.0422 | 0.6377 |
| Parse Failure Rate | 0.8726 | 0.0084 |
| Format Validity | 0.1274 | 0.9916 |
| Hallucination Rate | 0.0003 | 0.0226 |
Key improvement: the parse failure rate drops from 87.3% to 0.8%.
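The metric definitions are not spelled out in this card, so the following is only one plausible way to compute several of them from parsed predictions. It assumes each prediction is either `None` (output could not be parsed) or a `(name, args)` pair, and that unparsed outputs count as failures for every other metric. In the table above, Format Validity and Parse Failure Rate sum to 1, which this sketch mirrors.

```python
def key_f1(pred_args, gold_args):
    """F1 over argument names only; argument values are ignored."""
    pred_keys, gold_keys = set(pred_args), set(gold_args)
    if not pred_keys and not gold_keys:
        return 1.0
    overlap = len(pred_keys & gold_keys)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_keys)
    recall = overlap / len(gold_keys)
    return 2 * precision * recall / (precision + recall)


def score(pairs):
    """pairs: non-empty list of (pred, gold); pred is None on parse failure."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    totals = {
        "function_name_accuracy": 0.0,
        "full_tool_call_match": 0.0,
        "argument_key_f1": 0.0,
        "parse_failure_rate": 0.0,
    }
    for pred, gold in pairs:
        if pred is None:
            totals["parse_failure_rate"] += 1
            continue  # an unparsed output scores zero on the other metrics
        totals["function_name_accuracy"] += pred[0] == gold[0]
        totals["full_tool_call_match"] += pred == gold
        totals["argument_key_f1"] += key_f1(pred[1], gold[1])
    metrics = {name: value / len(pairs) for name, value in totals.items()}
    metrics["format_validity"] = 1.0 - metrics["parse_failure_rate"]
    return metrics


# Three illustrative examples: exact match, parse failure, missing argument.
pairs = [
    (("get_weather", {"city": "الرياض", "days": 1}),
     ("get_weather", {"city": "الرياض", "days": 1})),
    (None, ("get_prayer_times", {"city": "مكة"})),
    (("get_weather", {"city": "جدة"}),
     ("get_weather", {"city": "جدة", "days": 2})),
]
metrics = score(pairs)
```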
Dialect Performance
| Dialect | Function Accuracy |
|---|---|
| MSA | 0.761 |
| Gulf | 0.697 |
| Egyptian | 0.683 |
| Levantine | 0.694 |
| Maghrebi | 0.616 |
After fine-tuning, function accuracy ranges from 0.616 (Maghrebi) to 0.761 (MSA). Fine-tuning significantly reduces dialect disparity compared to the baseline model.
Known Limitations
Remaining errors are primarily semantic, including:
- Tool selection ambiguity
- Argument mismatches
- Domain overlap (e.g., weather vs. air quality)
Structured formatting errors are largely eliminated.
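Some argument mismatches can be caught before a call is executed by checking it against the tool schema. The sketch below is a minimal illustration, not part of the released model or dataset: the `SCHEMAS` dictionary, its `units` enum, and the `validate_call` helper are all hypothetical. It cannot catch purely semantic errors such as choosing a plausible but wrong tool.

```python
# Hypothetical schema; the real tool schemas live in the dataset.
SCHEMAS = {
    "get_weather": {
        "required": ["city"],
        "properties": {
            "city": {"type": str},
            "days": {"type": int},
            "units": {"type": str, "enum": ["metric", "imperial"]},
        },
    }
}


def validate_call(name, args, schemas=SCHEMAS):
    """Return a list of problems; an empty list means the call passed."""
    if name not in schemas:
        return [f"unknown tool: {name}"]
    spec = schemas[name]
    problems = [
        f"missing argument: {key}"
        for key in spec["required"]
        if key not in args
    ]
    for key, value in args.items():
        prop = spec["properties"].get(key)
        if prop is None:
            problems.append(f"unexpected argument: {key}")
        elif not isinstance(value, prop["type"]):
            problems.append(f"wrong type for {key}")
        elif "enum" in prop and value not in prop["enum"]:
            problems.append(f"invalid value for {key}: {value}")
    return problems


ok = validate_call("get_weather", {"city": "الرياض", "days": 1})
bad = validate_call("get_weather", {"units": "kelvin"})
```

A runtime could refuse to execute any call for which the returned list is non-empty and fall back to asking the user for clarification.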
Example Usage
Prompt:
ما حالة الطقس في الرياض اليوم؟
Model output:
<start_function_call>
call:get_weather{
city:<escape>الرياض<escape>,
days:1
}
<end_function_call>
The structured call can then be executed by the application runtime.
Intended Use
This model is designed for:
- Arabic AI assistants
- Tool-based agents
- Structured API orchestration
- Arabic enterprise automation
- Research on multilingual tool calling
Out-of-Scope Uses
This model is not designed for:
- General chatbots or open-ended conversation
- Sensitive decision-making systems
- Safety-critical deployments without additional validation
Related Models
| Model | Description |
|---|---|
| AISA-AR-FunctionCall-Think | Reasoning-augmented tool-calling model |
AISA Framework
This model is part of the AISA initiative for building reliable agentic AI systems.
Model collection: AISA-Framework/aisa-arabic-functioncall-datasets-and-models
License
Model tree for AISA-Framework/AISA-AR-FunctionCall-FT-q4_k_m
Base model
google/functiongemma-270m-it