State of Project
What works
- Question-first pipeline with JSONL chunk source by default; deterministic chunk IDs and JSONL caches for questions, generations, verifications, and rewards.
- Providers: Ollama/OpenAI/HTTP plus mock provider; mock pathway enables full pipeline tests without GPUs or ES.
- Verifier parsing tolerates distributor format (`SCORE` as number or `PASS/FAIL` with noisy prefixes); caching and retry logic in place.
- Tests: 42 passing (retrieval mock/real, generator, verifier, reward, pipeline behaviour, cache, full mock pipeline).
- CLI defaults: verbose on, question-first, JSONL chunks; chunk/question limits respected.
- Generator parsing/logging: preserves provider `.thinking` (structured) and parsed thought/answer/confidence/evidence/limitations; verbose mode prints both parsed and raw output (pretty-printed as JSON when parsable). Gold stores all generator fields.
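
As an illustration of the tolerant verifier parsing listed above, here is a minimal sketch in Python. The function name, the regular expressions, and the accepted separators are assumptions for illustration; the distributor's exact output format and the project's actual parser are not shown in this document.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns: assumes the verdict appears as "SCORE <number>"
# (optionally with ":" or "=") or as a standalone PASS/FAIL token.
_SCORE_RE = re.compile(r"\bSCORE\b\s*[:=]?\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)
_VERDICT_RE = re.compile(r"\b(PASS|FAIL)\b", re.IGNORECASE)


def parse_verifier_output(raw: str) -> Tuple[Optional[float], Optional[bool]]:
    """Return (score, passed); each is None when the signal is absent."""
    # search() rather than match() so that noisy prefixes are skipped.
    score_match = _SCORE_RE.search(raw)
    verdict_match = _VERDICT_RE.search(raw)
    score = float(score_match.group(1)) if score_match else None
    passed = verdict_match.group(1).upper() == "PASS" if verdict_match else None
    return score, passed
```

Case-insensitive matching keeps the parser forgiving, at the cost of possibly matching the word "pass" in ordinary prose; a stricter pattern may be preferable if the verifier model is chatty.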
What needs attention
- The real pipeline currently fails at question generation when Ollama or the question model is unreachable; a run requires a live Ollama instance with the specified model pulled.
- Reward is still mocked in the most recent run; swap to a real reward provider/model when available.
- Verifier prompt must stay distributor-provided; parsing is tolerant but malformed outputs still log raw text in verbose mode.
- Deprecation warning from `punycode` (Node) shows during tests; benign but noisy.
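
One way to surface the unreachable-Ollama failure earlier is a pre-flight check before question generation starts. The sketch below is an assumption-laden illustration, not part of the project: it queries Ollama's model listing endpoint (`/api/tags`) on the default local port, and the helper names `fetch_tags` and `model_available` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> Dict[str, Any]:
    """GET /api/tags from an Ollama server; raises OSError if it is unreachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def model_available(tags: Dict[str, Any], model: str) -> bool:
    """True if `model` is listed in an /api/tags payload.

    Assumes a bare model name refers to its ':latest' tag.
    """
    wanted = model if ":" in model else f"{model}:latest"
    names = {entry.get("name", "") for entry in tags.get("models", [])}
    return wanted in names
```

Calling `model_available(fetch_tags(), "<question model>")` at startup would let the run abort with a clear message instead of failing partway through question generation.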
Risks
- Long generator outputs can inflate verifier context; may need truncation or smaller verifier model to avoid context overruns.
- Cache growth: JSONL caches can grow large; add rotation/compaction if running many cycles.
- ES mode, if used, fetches 100 chunks by default; confirm chunk limits when switching from JSONL to ES.
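
For the context-overrun risk, one possible mitigation is to truncate generator output before it reaches the verifier. The sketch below keeps the head and tail of the text and drops the middle; the function name, the marker string, and the use of a character budget as a stand-in for a token budget are all assumptions.

```python
def truncate_middle(text: str, max_chars: int, marker: str = "\n[... truncated ...]\n") -> str:
    """Keep the head and tail of `text` so the result has at most `max_chars` characters."""
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    if len(text) <= max_chars:
        return text
    budget = max_chars - len(marker)
    if budget <= 0:
        # No room for the marker: fall back to a plain head cut.
        return text[:max_chars]
    head = (budget + 1) // 2  # the head gets the extra character on odd budgets
    tail = budget - head
    return text[:head] + marker + (text[-tail:] if tail else "")
```

Keeping both ends tends to preserve the opening reasoning and the final answer, which is usually what a verifier needs; a token-aware budget would be more precise if the verifier model's tokenizer is available.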
Next steps (suggested)
- Pull and start question/answer/verifier/reward models on Ollama (or configure OpenAI/HTTP) and re-run a small batch (`--limit`/`--chunk-limit`) to validate end-to-end with real models.
- Add optional verifier retry when JSON parse fails (1 retry) and cap logged transcripts to reduce noise in verbose runs.
- Consider a cache inspection/cleanup script for `data/cache/*.jsonl`.
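
A cache cleanup script could start from something like the following sketch, which rewrites a JSONL file keeping only the last record per key and reports how many lines were kept and dropped. The key field name `id` and the function name are assumptions; the actual cache record schema is not described here.

```python
import json
from pathlib import Path
from typing import Dict, Tuple


def compact_jsonl(path: Path, key: str = "id") -> Tuple[int, int]:
    """Rewrite `path` keeping the last record per `key`; return (kept, dropped)."""
    latest: Dict[str, str] = {}
    total = 0
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # blank lines are ignored and not counted
            total += 1
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # malformed lines are counted as dropped
            if isinstance(record, dict) and key in record:
                record_key = str(record[key])
                latest.pop(record_key, None)  # re-insert so order follows the last write
                latest[record_key] = line
    # Write to a sibling temp file, then swap it in, so a crash cannot leave a half-written cache.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text("".join(kept + "\n" for kept in latest.values()), encoding="utf-8")
    tmp.replace(path)
    return len(latest), total - len(latest)
```

Looping over `Path("data/cache").glob("*.jsonl")` and printing the returned counts would double as a simple inspection report.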