Evaluated using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) on 100 held-out general function-calling examples:

| Metric | Base | Fine-tuned | Delta |
|--------|------|------------|-------|
| Tool Selection Acc | 49.0% | 78.0% | **+29.0%** |
| First Tool Acc | 49.0% | 88.0% | **+39.0%** |
| Negative Rejection | 100.0% | 100.0% | +0.0% |
| Param Accuracy | 49.0% | 68.9% | **+19.9%** |

End-to-end through the [tool agent](https://github.com/tech-sumit/tool-agent) pipeline: **14% → 57%** tool selection accuracy on a 7-query evaluation.
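As a rough illustration, metrics like those in the table can be scored from paired expected/predicted tool calls. This is only a sketch: the record format, field names, and exact metric definitions below are assumptions for illustration, not the harness's actual task configuration.

```python
def score(examples):
    """Score function-calling predictions.

    Each example is a dict with an "expected" call and a list of
    "predicted" calls, where each call is {"name": str, "args": dict}.
    Metric definitions here are illustrative assumptions.
    """
    n = len(examples)
    # Tool selection: the set of predicted tool names exactly matches
    # the (single) expected tool.
    selection = sum(
        {p["name"] for p in ex["predicted"]} == {ex["expected"]["name"]}
        for ex in examples
    )
    # First tool: the first emitted call names the right tool.
    first = sum(
        bool(ex["predicted"])
        and ex["predicted"][0]["name"] == ex["expected"]["name"]
        for ex in examples
    )
    # Param accuracy: right tool first *and* arguments match exactly.
    params = sum(
        bool(ex["predicted"])
        and ex["predicted"][0]["name"] == ex["expected"]["name"]
        and ex["predicted"][0]["args"] == ex["expected"]["args"]
        for ex in examples
    )
    return {
        "tool_selection": selection / n,
        "first_tool": first / n,
        "param_accuracy": params / n,
    }


# Hypothetical examples (tool names invented for illustration).
examples = [
    {"expected": {"name": "get_weather", "args": {"city": "Paris"}},
     "predicted": [{"name": "get_weather", "args": {"city": "Paris"}}]},
    {"expected": {"name": "search_web", "args": {"query": "news"}},
     "predicted": [{"name": "get_weather", "args": {}},
                   {"name": "search_web", "args": {"query": "news"}}]},
]
print(score(examples))
# → {'tool_selection': 0.5, 'first_tool': 0.5, 'param_accuracy': 0.5}
```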
## Training