File size: 1,509 Bytes
2912983
97affc1
566d03e
2912983
566d03e
2912983
566d03e
2912983
 
 
 
566d03e
 
 
 
2912983
 
566d03e
 
 
 
 
 
 
 
 
 
 
 
 
2912983
a5f8ac7
 
 
 
 
566d03e
2912983
a5f8ac7
 
de63c9e
566d03e
 
 
2912983
a5f8ac7
566d03e
a5f8ac7
566d03e
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
---
title: Arabic Function Calling Leaderboard
emoji: 🏆
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
  - arabic
  - function-calling
  - leaderboard
  - llm-evaluation
---

# 🏆 Arabic Function Calling Leaderboard

لوحة تقييم استدعاء الدوال بالعربية

## Overview

The **Arabic Function Calling Leaderboard (AFCL)** evaluates Large Language Models on their ability to:

1. Understand Arabic queries (MSA + Dialects)
2. Select appropriate functions from available options
3. Extract correct arguments from Arabic text
4. Handle parallel and complex function calls
5. Detect when no function should be called

## Models Evaluated

- **Arabic-Native**: Jais, ALLaM, SILMA, AceGPT
- **Multilingual**: Qwen, Llama, Gemma, Mistral, Phi, BLOOMZ, Aya

## Dataset

📊 **Dataset**: [HeshamHaroon/Arabic_Function_Calling](https://huggingface.co/datasets/HeshamHaroon/Arabic_Function_Calling)

- **1,470 total samples** across 10 categories
- Simple, Multiple, Parallel, Parallel Multiple
- Irrelevance Detection
- Dialect Handling (Egyptian, Gulf, Levantine)

## Evaluation

The leaderboard automatically evaluates models using the HuggingFace Inference API when the Space starts.

## Citation

```bibtex
@misc{afcl2024,
    title={Arabic Function Calling Leaderboard},
    author={Hesham Haroon},
    year={2024},
    url={https://huggingface.co/spaces/HeshamHaroon/Arabic-Function-Calling-Leaderboard}
}
```