Scripts reference
Directory: scripts/
scripts/train_trl.py: main training
Full GRPO training, covering both the single-step curriculum and the episodic curriculum. See docs/train_trl.md for the complete reference.
LIFESTACK_NO_UNSLOTH=1 python scripts/train_trl.py # full 5-stage curriculum
python scripts/train_trl.py --dry-run # pipeline check, CPU OK
python scripts/train_trl.py --resume # resume from checkpoint
python scripts/train_trl.py --episode-train # episodic mode
python scripts/train_trl.py --push-to-hub --hub-model-id ... # upload after training
scripts/eval.py: random baseline
Runs a uniform random policy for N episodes and reports mean/std reward. No trained model, no GPU, no API key needed. Use this to establish a reward floor before GRPO runs or to verify env correctness after code changes.
python scripts/eval.py # 10 episodes, all domains
python scripts/eval.py --episodes 20 --domain flight_crisis
python scripts/eval.py --episodes 5 --verbose # per-step output
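The random-baseline idea described above can be sketched as follows. This is a minimal illustration, not the code in scripts/eval.py: `ToyEnv`, its `reset()`/`step()` return shapes, and the `random_baseline` helper are hypothetical stand-ins, and the real environment's interface may differ.

```python
import random
import statistics


class ToyEnv:
    """Hypothetical stand-in environment; NOT the LifeStack environment."""

    n_actions = 4   # assumed size of a discrete action space
    horizon = 5     # assumed fixed episode length

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._t = 0

    def reset(self):
        self._t = 0
        return 0  # dummy observation

    def step(self, action):
        self._t += 1
        # Placeholder reward that ignores the action entirely.
        reward = self._rng.uniform(-1.0, 1.0)
        done = self._t >= self.horizon
        return self._t, reward, done


def random_baseline(env, episodes=10, seed=0):
    """Run a uniform random policy; return (mean, std) of episode returns."""
    rng = random.Random(seed)
    returns = []
    for _ in range(episodes):
        env.reset()
        total, done = 0.0, False
        while not done:
            action = rng.randrange(env.n_actions)  # uniform over actions
            _, reward, done = env.step(action)
            total += reward
        returns.append(total)
    # Sample standard deviation needs at least two episodes.
    std = statistics.stdev(returns) if len(returns) > 1 else 0.0
    return statistics.mean(returns), std


mean_reward, std_reward = random_baseline(ToyEnv(), episodes=10)
print(f"mean={mean_reward:.3f} std={std_reward:.3f}")
```

A trained policy that does not clearly beat this mean (relative to the reported std) has probably not learned anything useful, which is why the floor is worth measuring first.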
scripts/plot_training.py: plot generation
Parses training logs (train_run_v*.log) and generates matplotlib plots: reward curve, loss curve, per-component reward breakdown, and 4-panel summary. Supports multiple log formats.
python scripts/plot_training.py --log train_run_v4.log --output-dir plots/
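As a rough illustration of the log-parsing step, the sketch below extracts metric series from log lines. It assumes each metrics line embeds a Python-style dict such as `{'loss': 0.9, 'reward': 0.1}`; that format, the key names, and the `parse_metrics` helper are assumptions for illustration, and the formats actually supported by scripts/plot_training.py are not specified here.

```python
import ast
import re

# Greedy match from the first "{" to the last "}" on a line (assumed format).
DICT_RE = re.compile(r"\{.*\}")


def parse_metrics(lines, keys=("reward", "loss")):
    """Collect numeric values for `keys` from lines that embed a dict literal."""
    series = {key: [] for key in keys}
    for line in lines:
        match = DICT_RE.search(line)
        if match is None:
            continue  # not a metrics line
        try:
            record = ast.literal_eval(match.group(0))
        except (ValueError, SyntaxError, TypeError):
            continue  # looked like a dict but did not parse; skip it
        if not isinstance(record, dict):
            continue
        for key in keys:
            value = record.get(key)
            if isinstance(value, (int, float)):
                series[key].append(float(value))
    return series


sample_log = [
    "step 1 {'loss': 0.9, 'reward': 0.1}",
    "some unrelated line",
    "step 2 {'loss': 0.7, 'reward': 0.3}",
]
metrics = parse_metrics(sample_log)
print(metrics)
```

Each resulting list can then be passed to a plotting call such as matplotlib's `plt.plot` to draw the reward or loss curve.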
scripts/smoke_test.py
Fast pipeline check: import validation + one reset() + one step(). No GPU, no downloads.
python scripts/smoke_test.py
Also called by setup.sh at the end of the install process.
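The three checks named above (import validation, one reset(), one step()) can be sketched like this. `StubEnv`, the module list, and the helper names are hypothetical; the real script presumably imports the project's own modules and environment instead.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: error} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report any import-time failure
            failures[name] = repr(exc)
    return failures


class StubEnv:
    """Hypothetical stand-in for the real environment."""

    def reset(self):
        return {"t": 0}

    def step(self, action):
        return {"t": 1}, 0.0, False  # (observation, reward, done)


def smoke_test(env, module_names, action=0):
    """Import check, then exactly one reset() and one step()."""
    failures = check_imports(module_names)
    if failures:
        raise RuntimeError(f"import check failed: {failures}")
    if env.reset() is None:
        raise RuntimeError("reset() returned no observation")
    if env.step(action) is None:
        raise RuntimeError("step() returned nothing")
    return True


# Standard-library modules stand in for the project's modules here.
ok = smoke_test(StubEnv(), ["json", "math"])
print("smoke test passed" if ok else "smoke test failed")
```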
scripts/upload_hf_model_cards.py
Uploads model cards and training artifacts to HuggingFace model repositories:
- docs/HF_MODEL_CARD_V4.md → jdsb06/lifestack-grpo-v4/README.md
- docs/HF_MODEL_CARD_V1.md → jdsb06/lifestack-grpo/README.md
- train_run_v1.log → jdsb06/lifestack-grpo/
- Relevant plots → jdsb06/lifestack-grpo-v4/plots/
python scripts/upload_hf_model_cards.py # requires HF_TOKEN or huggingface-cli login
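The file-to-repository mapping above could be expressed roughly as follows. This sketch is not the script's actual code: the `UPLOADS` table and `upload_all` helper are hypothetical, the log file is assumed to keep its name at the repository root, and the plots are left out because the source does not say which files count as relevant. The only library call used is `HfApi.upload_file` from `huggingface_hub`, imported lazily so the dry run works without it installed.

```python
from pathlib import Path

# (local path, target repo id, path inside the repo)
UPLOADS = [
    ("docs/HF_MODEL_CARD_V4.md", "jdsb06/lifestack-grpo-v4", "README.md"),
    ("docs/HF_MODEL_CARD_V1.md", "jdsb06/lifestack-grpo", "README.md"),
    # Assumption: the log keeps its file name at the repo root.
    ("train_run_v1.log", "jdsb06/lifestack-grpo", "train_run_v1.log"),
]


def upload_all(uploads, dry_run=True):
    """Return the planned uploads; perform them only when dry_run is False."""
    planned = []
    api = None
    for local_path, repo_id, path_in_repo in uploads:
        planned.append(f"{local_path} -> {repo_id}/{path_in_repo}")
        if dry_run:
            continue
        if not Path(local_path).is_file():
            raise FileNotFoundError(local_path)
        if api is None:
            # Lazy import; uses HF_TOKEN or the cached CLI login.
            from huggingface_hub import HfApi
            api = HfApi()
        api.upload_file(
            path_or_fileobj=local_path,
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type="model",
        )
    return planned


plan = upload_all(UPLOADS, dry_run=True)
for entry in plan:
    print(entry)
```

Running with `dry_run=True` first is a cheap way to confirm the mapping before anything is pushed to the Hub.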
scripts/run_episode.py
Runs one full episode with the LifeStackAgent (GRPO model or Groq API fallback). Requires credentials to be configured.
python scripts/run_episode.py
python scripts/run_episode.py --difficulty 3 --verbose
Related files
- docs/train_trl.md: full train_trl.py reference
- docs/eval.md: eval.py reference
- docs/training_guide.md: end-to-end training guide