We've added our fine-tuned model, PelicanVLM-FC, to BFCL_v4, and updated the model configuration based on Qwen2.5-VL-72B.
Our own testing indicates the model would rank 33rd on the leaderboard.
## Key Metrics

### Detailed Score Breakdown
Overall Acc: 46.02%
Non-Live AST Acc (90.21% overall)
- Simple AST: 80.83%
- Multiple AST: 96.00%
- Parallel AST: 95.00%
- Parallel Multiple AST: 89.00%
Live Acc (79.35% overall)
- Live Simple AST: 79.46%
- Live Multiple AST: 79.49%
- Live Parallel AST: 75.00%
- Live Parallel Multiple AST: 75.00%
Multi Turn Acc (29.75% overall)
- Multi Turn Base: 39.00%
- Multi Turn Miss Func: 21.50%
- Multi Turn Miss Param: 26.00%
- Multi Turn Long Context: 32.50%
Web Search Acc (36.50% overall)
- Web Search Base: 51.00%
- Web Search no Snippet: 22.00%
Memory Acc (21.51% overall)
- Memory KV: 7.74%
- Memory Vector: 17.42%
- Memory Recursive Summarization: 39.35%
Relevance Detection: 61.11%
Irrelevance Detection: 87.38%
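As a sanity check on the breakdown above, several category totals are simply the unweighted mean of their sub-scores (Non-Live AST, Multi Turn, and Web Search reproduce exactly; Memory differs by a rounding hair, and Live Acc does not match an unweighted mean, presumably because BFCL weights its sub-categories by sample count). A minimal sketch, not official BFCL scoring code:

```python
def category_mean(sub_scores):
    """Unweighted mean of sub-category accuracies, in percent."""
    return sum(sub_scores) / len(sub_scores)

# Sub-scores copied from the breakdown above.
non_live_ast = category_mean([80.83, 96.00, 95.00, 89.00])  # ~90.21, matches reported overall
multi_turn   = category_mean([39.00, 21.50, 26.00, 32.50])  # 29.75, matches reported overall
web_search   = category_mean([51.00, 22.00])                # 36.50, matches reported overall
memory       = category_mean([7.74, 17.42, 39.35])          # ~21.50 vs. reported 21.51 (rounding)

print(non_live_ast, multi_turn, web_search, memory)
```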