<!-- ABOUTME: Static Hugging Face organisation card content for the aec-bench profile. -->
<!-- ABOUTME: Presents project context and links without requiring a running Space app. -->
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>AEC-Bench</title>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<main>
<h1>AEC-Bench</h1>
<p>
<a href="https://github.com/TheodoreGalanos/aec-bench" target="_blank" rel="noreferrer">AEC-Bench</a>
is an open benchmark and Python toolkit for evaluating agentic AI systems on realistic
Architecture, Engineering, and Construction (AEC) tasks.
</p>
<p>
The project combines generated engineering tasks, executable verifiers, model rollout ledgers,
and trace artifacts, so results can be inspected beyond a single leaderboard score and broken
down by task family, difficulty, information visibility, tool use, cost, and failure mode.
</p>
<nav aria-label="AEC-Bench links">
<a href="https://arxiv.org/abs/2603.29199" target="_blank" rel="noreferrer">Paper</a>
<a href="https://github.com/TheodoreGalanos/aec-bench" target="_blank" rel="noreferrer">Code</a>
<a href="https://huggingface.co/datasets/aec-bench/release-model-rollouts" target="_blank" rel="noreferrer">Release dataset</a>
</nav>
<p class="footer">
This organisation hosts datasets, rollout artifacts, and benchmark releases for the AEC-Bench project.
</p>
</main>
</body>
</html>