We've added our own fine-tuned model, PelicanVLM-FC, to BFCL_v4, and updated the model configuration accordingly. PelicanVLM-FC is fine-tuned from Qwen2.5-VL-72B.

Our own testing results indicate that we'd rank 33rd on the leaderboard.

Key Metrics

Detailed Score Breakdown

Overall Acc: 46.02%

  1. Non-Live AST Acc (90.21% overall)

    • Simple AST: 80.83%
    • Multiple AST: 96.00%
    • Parallel AST: 95.00%
    • Parallel Multiple AST: 89.00%
  2. Live Acc (79.35% overall)

    • Live Simple AST: 79.46%
    • Live Multiple AST: 79.49%
    • Live Parallel AST: 75.00%
    • Live Parallel Multiple AST: 75.00%
  3. Multi Turn Acc (29.75% overall)

    • Multi Turn Base: 39.00%
    • Multi Turn Miss Func: 21.50%
    • Multi Turn Miss Param: 26.00%
    • Multi Turn Long Context: 32.50%
  4. Web Search Acc (36.50% overall)

    • Web Search Base: 51.00%
    • Web Search no Snippet: 22.00%
  5. Memory Acc (21.51% overall)

    • Memory KV: 7.74%
    • Memory Vector: 17.42%
    • Memory Recursive Summarization: 39.35%
  • Relevance Detection: 61.11%
  • Irrelevance Detection: 87.38%
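For reference, the top-level breakdown above can be collected programmatically. The sketch below is just an illustration: it stores the five category accuracies in a plain Python dict and prints an unweighted macro-average. Note that this is not how BFCL computes its Overall Acc (46.02%); the leaderboard applies its own per-category weighting, so the two numbers differ.

```python
# Per-category BFCL_v4 results for PelicanVLM-FC, copied from the
# breakdown above (values in percent).
category_acc = {
    "Non-Live AST": 90.21,
    "Live": 79.35,
    "Multi Turn": 29.75,
    "Web Search": 36.50,
    "Memory": 21.51,
}

# Simple unweighted macro-average across the five top-level categories.
# NOTE: purely illustrative -- BFCL's official Overall Acc (46.02%)
# uses the leaderboard's own weighting, so the two values differ.
macro_avg = sum(category_acc.values()) / len(category_acc)
print(f"Unweighted macro-average: {macro_avg:.2f}%")  # -> 51.46%
```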