Evaluated using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) on 100 held-out general function-calling examples:

| Metric | Base | Fine-tuned | Delta |
|--------|------|------------|-------|
| Tool Selection Acc | 49.0% | 78.0% | **+29.0%** |
| First Tool Acc | 49.0% | 88.0% | **+39.0%** |
| Negative Rejection | 100.0% | 100.0% | +0.0% |
| Param Accuracy | 49.0% | 68.9% | **+19.9%** |

End-to-end through the [tool agent](https://github.com/tech-sumit/tool-agent) pipeline: **14% → 57%** tool selection accuracy on a 7-query evaluation.
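As a rough illustration, metrics like those in the table can be scored from paired expected/predicted tool calls. This is only a sketch: the record format, field names, and exact metric definitions below are assumptions for illustration, not the harness's actual task configuration.

```python
def score(examples):
    """Score function-calling predictions.

    Each example is a dict with an "expected" call and a list of
    "predicted" calls, where each call is {"name": str, "args": dict}.
    Metric definitions here are illustrative assumptions.
    """
    n = len(examples)
    # Tool selection: the set of predicted tool names exactly matches
    # the (single) expected tool.
    selection = sum(
        {p["name"] for p in ex["predicted"]} == {ex["expected"]["name"]}
        for ex in examples
    )
    # First tool: the first emitted call names the right tool.
    first = sum(
        bool(ex["predicted"])
        and ex["predicted"][0]["name"] == ex["expected"]["name"]
        for ex in examples
    )
    # Param accuracy: right tool first *and* arguments match exactly.
    params = sum(
        bool(ex["predicted"])
        and ex["predicted"][0]["name"] == ex["expected"]["name"]
        and ex["predicted"][0]["args"] == ex["expected"]["args"]
        for ex in examples
    )
    return {
        "tool_selection": selection / n,
        "first_tool": first / n,
        "param_accuracy": params / n,
    }


# Hypothetical examples (tool names invented for illustration).
examples = [
    {"expected": {"name": "get_weather", "args": {"city": "Paris"}},
     "predicted": [{"name": "get_weather", "args": {"city": "Paris"}}]},
    {"expected": {"name": "search_web", "args": {"query": "news"}},
     "predicted": [{"name": "get_weather", "args": {}},
                   {"name": "search_web", "args": {"query": "news"}}]},
]
print(score(examples))
# → {'tool_selection': 0.5, 'first_tool': 0.5, 'param_accuracy': 0.5}
```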
## Training