---
license: cc-by-4.0
datasets:
- Salesforce/xlam-function-calling-60k
- MadeAgents/xlam-irrelevance-7.5k
base_model: Qwen/Qwen2-7B-Instruct
---

# Hammer-7b Function Calling Model

## <font color=red>\[Updates!!!\]</font> Hammer 2.0 Series Has Been Published

We're excited to release the lightweight Hammer 2.0 models ([0.5B](https://huggingface.co/MadeAgents/Hammer2.0-0.5b), [1.5B](https://huggingface.co/MadeAgents/Hammer2.0-1.5b), [3B](https://huggingface.co/MadeAgents/Hammer2.0-3b), and [7B](https://huggingface.co/MadeAgents/Hammer2.0-7b)) with strong function-calling capabilities, which empower developers to build personalized, on-device agentic applications.

## Introduction

**Hammer** is a series of cutting-edge Large Language Models (LLMs) crafted to boost the critical capability of AI agents: function calling. Unlike existing models that focus on refining training data, Hammer optimizes performance primarily through advanced training techniques. Focusing on on-device applications, we release models ranging from [1.5B](https://huggingface.co/MadeAgents/Hammer-1.5b) and [4B](https://huggingface.co/MadeAgents/Hammer-4b) to [7B](https://huggingface.co/MadeAgents/Hammer-7b) parameters.

## Model Details

Hammer2.0 is fine-tuned from the [Qwen 2 series](https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f) using function-masking techniques. It is trained on the [APIGen Function Calling Dataset](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) of 60,000 samples, supplemented by [xlam-irrelevance-7.5k](https://huggingface.co/datasets/MadeAgents/xlam-irrelevance-7.5k), which we generated. Hammer achieves exceptional performance across numerous function-calling benchmarks. For more details, please refer to [Hammer: Robust Function-Calling for On-Device Language Models via Function Masking](https://arxiv.org/abs/2410.04587) and the [Hammer GitHub repository](https://github.com/MadeAgents/Hammer).
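
The core idea behind function masking is to reduce the model's reliance on memorized function and parameter names, forcing it to reason from the natural-language descriptions instead. A minimal sketch of that idea (the helper name and the flat `parameters` layout are illustrative assumptions, not the paper's actual training code):

```python
import json

def mask_functions(tools):
    """Replace function and parameter names with neutral placeholders,
    keeping the descriptions intact, so a model trained on the masked
    schemas must rely on descriptions rather than name patterns.
    Illustrative sketch only; not the official Hammer implementation."""
    masked = json.loads(json.dumps(tools))  # deep copy, leave input untouched
    for i, tool in enumerate(masked):
        tool["name"] = f"func_{i}"
        params = tool.get("parameters", {})
        for j, key in enumerate(list(params)):
            params[f"param_{j}"] = params.pop(key)
    return masked

tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {"city": {"type": "string", "description": "City name."}},
}]
print(json.dumps(mask_functions(tools), indent=2))
```

The descriptions survive masking unchanged, which is what makes the masked schemas still answerable; only the surface names are anonymized.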

## Evaluation

First, we evaluate the Hammer series on the Berkeley Function-Calling Leaderboard (BFCL-v2):

<div style="text-align: center;">
<img src="figures/bfcl.PNG" alt="overview" width="1480" style="margin: auto;">
</div>

The table above shows that, within the BFCL framework, the Hammer series consistently achieves state-of-the-art performance at comparable scales; in particular, Hammer-7B's overall performance ranks second only to the proprietary GPT-4.
|
| 29 |
|
|
|
|
| 30 |
In addition, we evaluated our Hammer series (1.5b, 4b, 7b) on other academic benchmarks to further show our model's generalization ability:
|
| 31 |
|
| 32 |
<div style="text-align: center;">
|
| 33 |
<img src="figures/others.PNG" alt="overview" width="1000" style="margin: auto;">
|
| 34 |
</div>

Hammer models show highly stable performance across these benchmarks, suggesting the robustness of the series, whereas the baseline approaches display varying levels of effectiveness.

## Requirements

The code for Hammer-7b is included in the latest Hugging Face transformers, and we advise you to install `transformers>=4.37.0`.
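
Function-calling models in this family typically emit tool calls as a JSON list of `{"name": ..., "arguments": {...}}` objects. A minimal helper for parsing and sanity-checking such output against the supplied tool schemas might look like the sketch below (the output format and the flat `parameters` layout are assumptions for illustration, not the documented Hammer output schema):

```python
import json

def parse_tool_calls(output_text, tools):
    """Parse a model's JSON tool-call output and check each call against
    the provided tool schemas. Assumes output is a JSON list of
    {"name": ..., "arguments": {...}} objects (an assumption, not the
    documented Hammer format)."""
    known = {t["name"]: set(t.get("parameters", {})) for t in tools}
    calls = json.loads(output_text)
    for call in calls:
        if call["name"] not in known:
            raise ValueError(f"unknown tool: {call['name']}")
        extra = set(call.get("arguments", {})) - known[call["name"]]
        if extra:
            raise ValueError(f"unexpected arguments: {sorted(extra)}")
    return calls

tools = [{"name": "get_weather",
          "parameters": {"city": {"type": "string"}}}]
raw = '[{"name": "get_weather", "arguments": {"city": "Paris"}}]'
print(parse_tool_calls(raw, tools))
```

Validating calls before dispatch keeps hallucinated tool names or stray arguments from reaching application code.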