README / index.html


	<!-- ABOUTME: Static Hugging Face organisation card content for the aec-bench profile. -->
	<!-- ABOUTME: Presents project context and links without requiring a running Space app. -->
	<!doctype html>
	<html lang="en">
	<head>
	<meta charset="utf-8" />
	<meta name="viewport" content="width=device-width, initial-scale=1" />
	<title>AEC-Bench</title>
	<link rel="stylesheet" href="style.css" />
	</head>
	<body>
	<main>
	<h1>AEC-Bench</h1>
	<p>
	<a href="https://github.com/TheodoreGalanos/aec-bench" target="_blank" rel="noreferrer">AEC-Bench</a>
	is an open benchmark and Python toolkit for evaluating agentic AI systems on realistic
	Architecture, Engineering, and Construction (AEC) tasks.
	</p>
	<p>
	The project combines generated engineering tasks, executable verifiers, model rollout ledgers,
	and trace artifacts, so results can be inspected beyond a single leaderboard score and broken
	down by task family, difficulty, information visibility, tool use, cost, and failure mode.
	</p>
	<nav aria-label="AEC-Bench links">
	<a href="https://arxiv.org/abs/2603.29199" target="_blank" rel="noreferrer">Paper</a>
	<a href="https://github.com/TheodoreGalanos/aec-bench" target="_blank" rel="noreferrer">Code</a>
	<a href="https://huggingface.co/datasets/aec-bench/release-model-rollouts" target="_blank" rel="noreferrer">Release dataset</a>
	</nav>
	<p class="footer">
	This organisation hosts datasets, rollout artifacts, and benchmark releases for the AEC-Bench project.
	</p>
	</main>
	</body>
	</html>