RiddleHe committed
Commit 23307ba · 1 parent: 54383e8

Add YC-Bench evaluation results (avg $408,822)

YC-Bench medium preset evaluation across seeds 1, 2, 3.

Score: $408,822 average final funds (USD).

Benchmark: https://huggingface.co/datasets/collinear-ai/yc-bench
Source: https://github.com/collinear-ai/yc-bench

Files changed (1)
  1. .eval_results/yc-bench.yaml +9 -0
.eval_results/yc-bench.yaml ADDED
@@ -0,0 +1,9 @@
+- dataset:
+    id: collinear-ai/yc-bench
+    task_id: medium
+  value: 408822
+  date: "2026-03-24"
+  source:
+    url: https://github.com/collinear-ai/yc-bench
+    name: "YC-Bench eval"
+  notes: "avg final funds (USD) across seeds 1,2,3. Kimi K2.5 (via OpenRouter moonshotai/kimi-k2.5)"
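
A minimal sanity check for an entry like the one added above might look like the following sketch. The field names are taken from the diff; the entry is mirrored by hand as a Python dict rather than parsed from the file, and the helper `check_entry` and its validation rules are hypothetical, not part of YC-Bench or any Hugging Face tooling.

```python
from datetime import date

# Entry mirrored by hand from the diff; keys and values match the added YAML.
entry = {
    "dataset": {"id": "collinear-ai/yc-bench", "task_id": "medium"},
    "value": 408822,
    "date": "2026-03-24",
    "source": {
        "url": "https://github.com/collinear-ai/yc-bench",
        "name": "YC-Bench eval",
    },
    "notes": "avg final funds (USD) across seeds 1,2,3. "
             "Kimi K2.5 (via OpenRouter moonshotai/kimi-k2.5)",
}


def check_entry(item):
    """Return a list of problems found in one result entry.

    Hypothetical helper: the rules below are assumptions for illustration,
    not an official schema.
    """
    problems = []

    # The dataset block should identify the benchmark and the task preset.
    dataset = item.get("dataset") or {}
    for key in ("id", "task_id"):
        if not isinstance(dataset.get(key), str) or not dataset.get(key):
            problems.append(f"dataset.{key} missing or empty")

    # The score should be a plain number (bool is excluded on purpose).
    value = item.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("value is not a number")

    # The date should be an ISO 8601 calendar date such as 2026-03-24.
    raw_date = item.get("date")
    try:
        if not isinstance(raw_date, str):
            raise ValueError
        date.fromisoformat(raw_date)
    except ValueError:
        problems.append("date is not an ISO date string")

    # The source block should carry an https URL.
    source = item.get("source") or {}
    if not str(source.get("url", "")).startswith("https://"):
        problems.append("source.url is not an https URL")

    return problems


if __name__ == "__main__":
    print(check_entry(entry))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a results file in a single pass.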