Luke-Barnard/vibes-bench-model

Luke-Barnard committed on Apr 22

Commit 02c1264 · verified · 1 Parent(s): cce973f

Upload README.md

Files changed (1)

README.md +22 -0

README.md ADDED

	@@ -0,0 +1,22 @@

+# Vibes Bench Baseline
+This repository hosts the model iterations for a black-box optimization loop against the "vibes bench" benchmark.
+## Baseline Model
+**v1.0-baseline**: [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B), a 9.6B-parameter instruct model with native thinking/reasoning support.
+This is the starting point. Subsequent commits will be fine-tuned iterations optimized purely via scalar feedback from the hidden benchmark.
+## Optimization Loop
+1. I push a new model iteration to this repo.
+2. The benchmark automatically evaluates it via the inference API.
+3. The score is uploaded to [Luke-Barnard/vibes-bench-scores](https://huggingface.co/datasets/Luke-Barnard/vibes-bench-scores).
+4. I read the score and iterate.
+## Iteration Log
+| Iteration | Description | Score |
+|-----------|-------------|-------|
+| v1.0 | Qwen3.5-9B baseline (no fine-tuning) | TBD |
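
The optimization loop described in the README can be sketched as a plain black-box search driven by one scalar per iteration. This is a minimal illustration, not the author's implementation: `Iteration`, `log_row`, `run_loop`, `propose`, and `evaluate` are hypothetical names, and `evaluate` stands in for the whole "push, auto-evaluate, read the score" round trip. No Hugging Face API calls are made here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Iteration:
    """One row of the iteration log (hypothetical structure)."""
    version: str
    description: str
    score: Optional[float] = None  # None until the benchmark reports a score


def log_row(it: Iteration) -> str:
    """Format an iteration as a row of the README's markdown table."""
    score = "TBD" if it.score is None else f"{it.score:.3f}"
    return f"| {it.version} | {it.description} | {score} |"


def run_loop(
    baseline: Iteration,
    propose: Callable[[Iteration, List[Iteration]], Iteration],
    evaluate: Callable[[Iteration], float],
    n_iters: int,
) -> Tuple[List[Iteration], Iteration]:
    """Greedy black-box loop: only the scalar score guides the search.

    `evaluate` is an assumed stand-in for pushing the model and reading
    the score back; `propose` builds the next candidate from the best
    iteration so far and the full history.
    """
    baseline.score = evaluate(baseline)
    history = [baseline]
    best = baseline
    for _ in range(n_iters):
        candidate = propose(best, history)
        candidate.score = evaluate(candidate)
        history.append(candidate)
        if candidate.score > best.score:  # keep the best-scoring iteration
            best = candidate
    return history, best
```

Before any score is available, `log_row(Iteration("v1.0", "Qwen3.5-9B baseline (no fine-tuning)"))` reproduces the table row above with `TBD` in the score column. The greedy "keep the best" rule is only one possible strategy; the README does not say how the next iteration is chosen.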