Useful tool from Modal

#11
by mkieffer - opened
Agents-MCP-Hackathon org

Cool app from Modal that benchmarks LLM latency across various frameworks running on Modal:
https://modal.com/llm-almanac/advisor

This kind of latency benchmark is useful for agents too, especially once a tool call or routing decision fans out into multiple model calls.

In agent benchmarks, I would like to see not only single-call latency but also the end-to-end shape of a run:

  • number of model calls
  • number of tool calls
  • p50/p95 latency per step
  • retries/timeouts
  • final success rate
  • total cost per completed run
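
As a rough illustration of what reporting that run shape could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Step` and `Run` records, the "model"/"tool" labels, the nearest-rank percentile, and the reading of "cost per completed run" as total spend across all runs divided by the number of successful runs. It is not tied to Modal or to any particular agent framework.

```python
import math
from dataclasses import dataclass


@dataclass
class Step:
    kind: str            # hypothetical labels: "model" or "tool"
    latency_s: float     # wall-clock time of this step, in seconds
    retries: int = 0
    timed_out: bool = False


@dataclass
class Run:
    steps: list
    succeeded: bool
    cost_usd: float


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(runs):
    """Aggregate the run-shape metrics over a non-empty list of runs."""
    steps = [s for r in runs for s in r.steps]
    completed = sum(1 for r in runs if r.succeeded)
    total_cost = sum(r.cost_usd for r in runs)

    # Latency percentiles are reported per step kind (assumed grouping).
    latency = {}
    for kind in sorted({s.kind for s in steps}):
        values = [s.latency_s for s in steps if s.kind == kind]
        latency[kind] = {"p50": percentile(values, 50),
                         "p95": percentile(values, 95)}

    return {
        "model_calls_per_run": sum(s.kind == "model" for s in steps) / len(runs),
        "tool_calls_per_run": sum(s.kind == "tool" for s in steps) / len(runs),
        "latency_s": latency,
        "retries": sum(s.retries for s in steps),
        "timeouts": sum(s.timed_out for s in steps),
        "success_rate": completed / len(runs),
        # Failed runs still cost money, so their spend is included here.
        "cost_per_completed_run": total_cost / completed if completed else None,
    }
```

Charging failed runs to the completed ones is a deliberate choice in this sketch: it makes an agent that retries its way to success look as expensive as it actually is.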

For MCP-heavy agents, the slow part is often the choreography around the model rather than the model call alone.
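
One way to check that claim on a given agent is to compare the wall-clock time of a run with the time spent inside model calls. The sketch below is only an illustration: `orchestration_share` is a hypothetical helper, and it assumes the model calls run one after another, since overlapping calls would make their summed time exceed the wall-clock time.

```python
def orchestration_share(wall_time_s, model_call_times_s):
    """Fraction of a run's wall-clock time spent outside model calls.

    Assumes sequential model calls; the result is clamped at 0.0 in case
    overlapping calls push the summed model time past the wall-clock time.
    """
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")
    model_time = sum(model_call_times_s)
    return max(0.0, wall_time_s - model_time) / wall_time_s
```

For example, a 10-second run with model calls of 2 and 1 seconds spends about 70% of its time in tool calls, MCP round trips, retries, and other glue.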
