ProgramBench: Can Language Models Rebuild Programs From Scratch?

Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior. ProgramBench evaluates this capability across 200 real-world open-source projects spanning Rust, Go, C, C++, Haskell, and Java.
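One way to picture "reproduces the original program's behavior" is a differential check: run the reference binary and the rebuilt program on the same inputs and compare what each one emits. The sketch below is only an illustration under stated assumptions, not ProgramBench's actual harness. The names `run_program` and `match_rate`, the choice to compare only exit code and stdout, and the 10-second timeout are all hypothetical.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Observable behavior of one invocation (assumed: exit code and stdout only)."""
    exit_code: int
    stdout: bytes


def run_program(command, args, stdin=b"", timeout=10.0):
    """Run `command` with `args`, feeding `stdin`, and capture its behavior."""
    completed = subprocess.run(
        [*command, *args],
        input=stdin,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # stderr is ignored in this sketch
        timeout=timeout,
        check=False,  # a non-zero exit code is data here, not an error
    )
    return RunResult(completed.returncode, completed.stdout)


def match_rate(reference_cmd, candidate_cmd, test_cases):
    """Return the fraction of test cases where both programs behave identically.

    `test_cases` is a list of (args, stdin_bytes) pairs. An empty list yields 0.0.
    """
    if not test_cases:
        return 0.0
    passed = 0
    for args, stdin in test_cases:
        expected = run_program(reference_cmd, args, stdin)
        actual = run_program(candidate_cmd, args, stdin)
        if expected == actual:
            passed += 1
    return passed / len(test_cases)


if __name__ == "__main__":
    # Stand-ins for a reference binary and a rebuilt program: two small
    # Python one-liners that both upper-case their standard input.
    reference = [sys.executable, "-c",
                 "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    candidate = [sys.executable, "-c",
                 "import sys; print(sys.stdin.read().upper(), end='')"]
    cases = [([], b"abc"), ([], b"Hello\n")]
    print(match_rate(reference, candidate, cases))
```

Comparing only exit code and stdout keeps the sketch short. A real evaluation would likely also need to account for files written, stderr, and nondeterministic output, none of which is specified in the text above.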
