YC-Bench: Can Your AI Agent Run a Startup Without Going Bankrupt?

Community Article Published April 2, 2026

TL;DR: We built a benchmark that makes LLMs run a simulated startup for a full year, with hiring decisions, shady clients, tight deadlines, and all. Only 3 out of 12 frontier models turned a profit. Most went bankrupt. Here's what we learned.

[Figure: YC-Bench leaderboard]

If you like YC-Bench, make sure to give it a star on our repo and a heart on our leaderboard. Check out Collinear's SimLab to improve your AI agent's long-horizon capabilities!
