# NOTICE

Lean Laguna is a **serving recipe + benchmark harness + reusable RL environment** built on top of
Poolside's released Laguna XS.2. It does **not** redistribute model weights.

## Built on
- **Laguna XS.2** (`poolside/Laguna-XS.2`, Apache-2.0). The base model; not included here.
- **DFlash speculator** (`poolside/Laguna-XS.2-speculator.dflash`). The 0.6B draft model used for
  speculative decoding; not included here. The Laguna DFlash *speculator checkpoint* was trained by Poolside.
- **DFlash method** — the speculative-decoding drafting method, integrated in vLLM via
  `--speculative-config '{"method":"dflash", ...}'`.

## What is original here (Apache-2.0)
The benchmark/serving harness (`scripts/`, `bench/`, `evals/`), the measured A/B results (`results/`),
the `spec_rl` verifiers environment (`spec_rl/`), and the endpoint/configuration seam (`configs/`).
Under greedy decoding the outputs are byte-identical to the base model — the speedup is lossless.
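
The byte-identity claim above can be checked mechanically. The following is a minimal sketch, not the harness in this repository: the helper names `first_mismatch` and `digest` are hypothetical, and it assumes you have already collected one greedy-decoded completion per prompt from the baseline run and from the speculative run, in the same prompt order.

```python
import hashlib


def first_mismatch(baseline, speculative):
    """Return the index of the first prompt whose two completions differ
    as UTF-8 bytes, or None if every pair is byte-identical.

    Hypothetical helper: both arguments are lists of completion strings,
    aligned by prompt.
    """
    if len(baseline) != len(speculative):
        raise ValueError("both runs must cover the same prompts")
    for i, (a, b) in enumerate(zip(baseline, speculative)):
        if a.encode("utf-8") != b.encode("utf-8"):
            return i
    return None


def digest(outputs):
    """SHA-256 over all completions, length-prefixed so that different
    splits of the same text cannot produce the same digest."""
    h = hashlib.sha256()
    for text in outputs:
        data = text.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    # Toy stand-ins for real completions (illustrative data only).
    base = ["def add(a, b):\n    return a + b\n", "print('hi')\n"]
    spec = list(base)
    print("first mismatch:", first_mismatch(base, spec))
    print("digests equal:", digest(base) == digest(spec))
```

Comparing digests gives a compact value to record for each run, while `first_mismatch` points at the prompt to inspect if the two runs ever diverge.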

## Cite
Built for the Poolside Research Hackathon. See `README.md` for the method, measured results, and
reproduction steps.