Commit History

Cap prompt generation at 512 tokens and add version print
ee71a24
unverified

Claude committed on

Add SFT warm start before GRPO and DB connectivity init check
c2dc160
unverified

Claude committed on

Add local model inference backend for Layer 2
10418d0
unverified

Claude committed on

Make Supabase uploads incremental — upload after every step
76f180f
unverified

Claude committed on

Add Supabase upload for training results (Storage + DB)
28bcb40
unverified

Claude committed on

Add raw training summary output and adjust training scale
71b0977
unverified

Claude committed on

Add volume verification, fsync, and stdout fallback for training outputs
f703ff1
unverified

Claude committed on

Clean up dead code, unused imports, and move hardcoded values to config.yaml
3dc48b7
unverified

Claude committed on

Add --llm-agent and other legacy CLI flags for backwards compatibility
03d9529
unverified

Claude committed on

Centralize all training params in config.yaml (single source of truth)
4e2b74e
unverified

Claude committed on

Remove mock mode: only real GRPO RL training remains
288d9a2
unverified

Claude committed on

Add clear training progress logging with technical + domain names
4b89b89
unverified

Claude committed on

Update docstrings to reflect LLM-only training pipeline
01518e0
unverified

Claude committed on

Align GRPOConfig defaults with CLI: 10 steps, 7 episodes
ca36c02
unverified

Claude committed on

Remove all rule-based fallback systems, require LLM inference
21da591
unverified

Claude committed on

Reduce training defaults for fast iteration: steps=10, episodes=7
b1d7ca2
unverified

Claude committed on

Add training report & logging system with reward charts and conversation comparisons
506d641
unverified

Claude committed on

Wire up real LLM integration via HF Inference API
4ac72af
unverified

Claude committed on

Fix critical gaps: prompt-sensitive agent, adversarial customers, executable GRPO, OpenEnv wrapper
b259333
unverified

Claude committed on