Luke-Barnard committed on
Commit 02c1264 · verified · 1 Parent(s): cce973f

Upload README.md

Files changed (1)
  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ # Vibes Bench Baseline
+
+ This repository is used for the black-box optimization loop for the "vibes bench".
+
+ ## Baseline Model
+
+ **v1.0-baseline**: [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) — 9.6B parameter instruct model with native thinking/reasoning support.
+
+ This is the starting point. Subsequent commits will be fine-tuned iterations optimized purely via scalar feedback from the hidden benchmark.
+
+ ## Optimization Loop
+
+ 1. I push a new model iteration to this repo
+ 2. Benchmark auto-evaluates via inference API
+ 3. Score uploaded to [Luke-Barnard/vibes-bench-scores](https://huggingface.co/datasets/Luke-Barnard/vibes-bench-scores)
+ 4. I read the score and iterate
+
+ ## Iteration Log
+
+ | Iteration | Description | Score |
+ |-----------|-------------|-------|
+ | v1.0 | Qwen3.5-9B baseline (no fine-tuning) | TBD |
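
The four-step optimization loop in the README above can be sketched roughly as follows. This is a minimal illustration, not the author's tooling: `Iteration`, `run_loop`, and `render_log` are hypothetical names, the `evaluate` callable stands in for the push, auto-evaluate, and score-upload steps (which the README does not specify in detail), and the sketch assumes that a higher score is better.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Iteration:
    """One row of the iteration log (hypothetical structure)."""
    tag: str                        # e.g. "v1.0"
    description: str
    score: Optional[float] = None   # None until the benchmark reports a score


def run_loop(candidates: List[Iteration],
             evaluate: Callable[[Iteration], float]) -> Iteration:
    """Score each candidate in order and return the best one seen.

    `evaluate` is a stand-in for steps 1-3 (push the model, let the hidden
    benchmark run, wait for the scalar score to be published).
    Assumption: higher scores are better.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    best = None
    for it in candidates:
        it.score = evaluate(it)     # the only feedback is a single number
        if best is None or it.score > best.score:
            best = it               # step 4: read the score, keep the best
    return best


def render_log(iterations: List[Iteration]) -> str:
    """Render the iteration log as a markdown table; unscored rows show TBD."""
    rows = [
        "| Iteration | Description | Score |",
        "|-----------|-------------|-------|",
    ]
    for it in iterations:
        score = "TBD" if it.score is None else f"{it.score:.3f}"
        rows.append(f"| {it.tag} | {it.description} | {score} |")
    return "\n".join(rows)
```

Because the benchmark is hidden, the loop only ever sees the returned number, which is why `evaluate` is passed in as an opaque callable rather than modeled in any detail.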