sumitagrawal committed
Commit 30426dd · verified · 1 Parent(s): c2ee37e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +1 -10
README.md CHANGED
@@ -31,16 +31,7 @@ specialized for **general tool/function calling**.
 
 ![Benchmark Results](benchmark-results.png)
 
-Evaluated using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) on 100 held-out general function-calling examples:
-
-| Metric | Base | Fine-tuned | Delta |
-|--------|------|-----------|-------|
-| Tool Selection Acc | 49.0% | 78.0% | **+29.0%** |
-| First Tool Acc | 49.0% | 88.0% | **+39.0%** |
-| Negative Rejection | 100.0% | 100.0% | +0.0% |
-| Param Accuracy | 49.0% | 68.9% | **+19.9%** |
-
-End-to-end through the [tool agent](https://github.com/tech-sumit/tool-agent) pipeline: **14% → 57%** tool selection accuracy on a 7-query evaluation.
+Evaluated using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) on 100 held-out general function-calling examples. End-to-end through the [tool agent](https://github.com/tech-sumit/tool-agent) pipeline: **14% → 57%** tool selection accuracy on a 7-query evaluation.
 
 ## Training
 