
---
title: ClawBench Leaderboard
emoji: 🦀
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.15.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: Can AI agents complete everyday online tasks?
tags:
  - arxiv:2604.08523
  - leaderboard
  - benchmark
  - web-agents
  - browser-automation
  - agent-evaluation
  - llm-evaluation
---

ClawBench Leaderboard

Live results for the ClawBench web-agent benchmark, backed by leaderboard/results.csv in the dataset repo. Submit your model by opening a PR there.
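
Before opening a PR, it can help to check how a results file would rank locally. The snippet below is a minimal sketch only: the column names `model` and `success_rate`, the helper `rank_rows`, and the sample rows are all hypothetical placeholders, since the actual schema of leaderboard/results.csv is not described here. Check the real header in the dataset repo and adjust accordingly.

```python
import csv
import io


def rank_rows(csv_text, score_column):
    """Parse CSV text and return its rows as dicts, highest score first.

    `score_column` is whichever numeric column you want to rank by;
    the name used below is an assumption, not the real schema.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda row: float(row[score_column]), reverse=True)


# Placeholder data for illustration only; these are NOT real ClawBench results,
# and the column names are assumed.
SAMPLE = "model,success_rate\nmodel-a,12.5\nmodel-b,40.0\nmodel-c,27.5\n"

ranked = rank_rows(SAMPLE, "success_rate")
for position, row in enumerate(ranked, start=1):
    print(position, row["model"], row["success_rate"])
```

To try it on a local copy of the file, read the file's text and pass it in place of `SAMPLE`, using the real score column name.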