lean-laguna/results/baseline.json
Lean Laguna: Laguna XS.2 + DFlash — lossless single-GPU speedup + cheaper RL rollouts
8612587
{
"label": "baseline",
"model": "poolside/Laguna-XS.2",
"n": 14,
"tokens_per_s_mean": 19.64077204940069,
"ttft_s_mean": 6.58612985270364,
"acceptance_length_tau": 1.0,
"source": "HF Job 6a19d8b73a4b8cae6044dfdf (h200), 2026-05-29; vLLM 0.22.0, --enforce-eager, --max-model-len 4096, greedy (temperature=0), no speculator",
"prompt_set": "14 distinct mixed-difficulty Python prompts (trivial fib/is_prime -> medium binary_search/roman_to_int -> hard lcs/parse_duration/dijkstra/LRUCache)",
"corroborating_run": "An earlier 20-prompt trivial-only run (job 6a19d2105c8d10ffa1107774) gave baseline 19.47 tok/s.",
"note": "ttft_s_mean here is full-completion latency, NOT true time-to-first-token; we make no TTFT claim. Summary stats are over all n=14."
}
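
The record above is a no-speculator reference point, so its main use is as the denominator when comparing another run's throughput. The following is a minimal sketch of that comparison, not the project's own tooling: the helper names `load_baseline` and `speedup` are hypothetical, and the embedded JSON is a trimmed copy of the numeric fields above. It also reads `ttft_s_mean` under a clearer local name, since the note states that the field holds full-completion latency rather than true time-to-first-token.

```python
import json

# Trimmed copy of the numeric fields from results/baseline.json above.
BASELINE_JSON = """
{
  "label": "baseline",
  "model": "poolside/Laguna-XS.2",
  "n": 14,
  "tokens_per_s_mean": 19.64077204940069,
  "ttft_s_mean": 6.58612985270364,
  "acceptance_length_tau": 1.0
}
"""


def load_baseline(text):
    """Parse a baseline record and sanity-check it (hypothetical helper)."""
    record = json.loads(text)
    # Without a speculator, exactly one token is produced per decoding step,
    # so the acceptance length of a true baseline must be 1.0.
    if record["acceptance_length_tau"] != 1.0:
        raise ValueError("baseline record should have acceptance_length_tau == 1.0")
    return record


def speedup(baseline, candidate_tokens_per_s):
    """Throughput ratio of a candidate run over the baseline (hypothetical helper)."""
    return candidate_tokens_per_s / baseline["tokens_per_s_mean"]


baseline = load_baseline(BASELINE_JSON)

# Per the record's note, this field is full-completion latency, not TTFT.
completion_latency_s = baseline["ttft_s_mean"]

print(f"baseline throughput: {baseline['tokens_per_s_mean']:.2f} tok/s over n={baseline['n']}")
print(f"mean full-completion latency: {completion_latency_s:.2f} s")
```

A candidate run's mean tokens per second can then be passed to `speedup(baseline, ...)`. The ratio is only meaningful if the candidate was measured on the same prompt set and with the same vLLM settings listed in `source`.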