good - an AustinOS Collection


AustinOS's Collections

good

updated about 1 month ago

Running

Featured

101

CUGA Agent

🤖

Configurable Generalist Agent, leader in AppWorld Benchmark

MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

Paper • 2603.28407 • Published Mar 30 • 70

How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings

Paper • 2604.04323 • Published Apr 6 • 41
