Vibes Bench Baseline

This repository tracks the models produced by the black-box optimization loop for the "vibes bench".

Baseline Model

v1.0-baseline: Qwen/Qwen3.5-9B, a 9.6B-parameter instruct model with native thinking/reasoning support.

This is the starting point. Each subsequent commit will be a fine-tuned iteration, optimized using only the scalar score returned by the hidden benchmark.
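
To illustrate what "optimized purely via scalar feedback" can look like, here is a minimal sketch of a black-box loop. It is not the procedure used in this repository: the search strategy (simple random-perturbation hill climbing), the names `black_box_optimize` and `toy_score`, and the idea of representing a candidate as a list of floats are all assumptions made for illustration. In practice the candidate would be a fine-tuning configuration or checkpoint, and the score would come from the hidden benchmark rather than a local function.

```python
import random
from typing import Callable, List, Tuple


def black_box_optimize(
    score_fn: Callable[[List[float]], float],
    initial: List[float],
    iterations: int = 50,
    step: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], float, List[float]]:
    """Hill-climb using only the scalar returned by score_fn (higher is better).

    Hypothetical helper: the optimizer never inspects how the score is
    computed; it only compares scalar values between candidates.
    """
    rng = random.Random(seed)
    best = list(initial)
    best_score = score_fn(best)
    history = [best_score]  # best score seen so far, one entry per evaluation round

    for _ in range(iterations):
        # Propose a candidate by perturbing the current best with Gaussian noise.
        candidate = [x + rng.gauss(0.0, step) for x in best]
        candidate_score = score_fn(candidate)
        # Keep the candidate only if the scalar feedback strictly improves.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        history.append(best_score)

    return best, best_score, history


def toy_score(params: List[float]) -> float:
    """Stand-in for the hidden benchmark (assumption): peaks when every value is 1.0."""
    return -sum((x - 1.0) ** 2 for x in params)


if __name__ == "__main__":
    best, best_score, history = black_box_optimize(toy_score, [0.0, 0.0])
    print(f"start score: {history[0]:.4f}, final score: {best_score:.4f}")
```

Each accepted candidate corresponds to one row in the iteration table: a new version is recorded only when its score beats the current best.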

Iteration	Description	Score
v1.0	Qwen3.5-9B baseline (no fine-tuning)	TBD
