Scripts reference

Directory: scripts/

`scripts/train_trl.py` — main training

Runs full GRPO training with either the single-step curriculum or the episodic curriculum. See train_trl.md for the complete reference.

LIFESTACK_NO_UNSLOTH=1 python scripts/train_trl.py            # full 5-stage curriculum
python scripts/train_trl.py --dry-run                         # pipeline check, CPU OK
python scripts/train_trl.py --resume                          # resume from checkpoint
python scripts/train_trl.py --episode-train                   # episodic mode
python scripts/train_trl.py --push-to-hub --hub-model-id ...  # upload after training
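
For orientation, the flags above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the script's actual code: the function names, the rule that `--push-to-hub` needs `--hub-model-id`, and the reading of `LIFESTACK_NO_UNSLOTH=1` as "disable Unsloth" are all assumptions. See train_trl.md for the real behavior.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed above."""
    parser = argparse.ArgumentParser(prog="train_trl.py")
    parser.add_argument("--dry-run", action="store_true", help="pipeline check, CPU OK")
    parser.add_argument("--resume", action="store_true", help="resume from checkpoint")
    parser.add_argument("--episode-train", action="store_true", help="episodic mode")
    parser.add_argument("--push-to-hub", action="store_true", help="upload after training")
    parser.add_argument("--hub-model-id", default=None, help="target Hub repository")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: an upload needs a target repository, so fail early without one.
    if args.push_to_hub and not args.hub_model_id:
        parser.error("--push-to-hub requires --hub-model-id")
    return args


def unsloth_disabled(environ=os.environ) -> bool:
    # Assumption: the value "1" switches Unsloth off; the real check may differ.
    return environ.get("LIFESTACK_NO_UNSLOTH") == "1"
```

Calling `parse_args(["--dry-run"])` returns a namespace with `dry_run=True` and every other flag at its default.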

`scripts/eval.py` — random baseline

Runs a uniform random policy for N episodes and reports mean/std reward. No trained model, no GPU, no API key needed. Use this to establish a reward floor before GRPO runs or to verify env correctness after code changes.

python scripts/eval.py                              # 10 episodes, all domains
python scripts/eval.py --episodes 20 --domain flight_crisis
python scripts/eval.py --episodes 5 --verbose       # per-step output
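
The idea behind the random baseline can be sketched in a few lines. `ToyEnv` is a hypothetical stand-in, not the project's environment, and the `reset()`/`step()` return shapes are assumptions. The sketch only shows how a uniform random policy yields a mean/std reward floor.

```python
import random
import statistics


class ToyEnv:
    """Hypothetical stand-in environment; the real env API may differ."""

    def __init__(self, n_actions=4, horizon=5):
        self.n_actions = n_actions
        self.horizon = horizon
        self._t = 0

    def reset(self):
        self._t = 0
        return 0  # dummy observation

    def step(self, action):
        self._t += 1
        reward = 1.0 if action == 0 else 0.0  # toy reward: only action 0 pays
        done = self._t >= self.horizon
        return 0, reward, done


def random_baseline(env, episodes=10, seed=0):
    """Run a uniform random policy and return (mean, std) of episode returns."""
    rng = random.Random(seed)
    returns = []
    for _ in range(episodes):
        env.reset()
        total, done = 0.0, False
        while not done:
            action = rng.randrange(env.n_actions)  # uniform over all actions
            _, reward, done = env.step(action)
            total += reward
        returns.append(total)
    return statistics.mean(returns), statistics.pstdev(returns)


mean, std = random_baseline(ToyEnv(), episodes=20, seed=1)
print(f"random baseline: mean={mean:.2f} std={std:.2f}")
```

A trained policy that cannot beat this floor on the same environment suggests a problem with the reward signal or the training run.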

`scripts/plot_training.py` — plot generation

Parses training logs (train_run_v*.log) and generates matplotlib plots: a reward curve, a loss curve, a per-component reward breakdown, and a 4-panel summary. Supports multiple log formats.

python scripts/plot_training.py --log train_run_v4.log --output-dir plots/
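
The parsing step might look roughly like the sketch below. The `key=value` line format is a hypothetical example, since the actual log formats are not documented here. `parse_log` and `moving_average` are illustrative names, not functions from the script.

```python
import re

# Assumed (hypothetical) line format: "step=10 reward=0.42 loss=1.31"
PAIR = re.compile(r"(\w+)=(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")
REQUIRED = ("step", "reward", "loss")


def parse_log(lines):
    """Collect step, reward, and loss series from lines that carry all three."""
    series = {key: [] for key in REQUIRED}
    for line in lines:
        fields = {k: float(v) for k, v in PAIR.findall(line)}
        if not all(key in fields for key in REQUIRED):
            continue  # skip lines that are not metric records
        for key in REQUIRED:
            series[key].append(fields[key])
    return series


def moving_average(values, window=3):
    """Trailing mean over up to `window` points, useful for smoothing curves."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

The resulting lists can be handed directly to matplotlib, for example by plotting `series["step"]` against `moving_average(series["reward"])`.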

`scripts/smoke_test.py`

Fast pipeline check: import validation + one reset() + one step(). No GPU, no downloads.

python scripts/smoke_test.py

setup.sh also runs this script at the end of the install process.
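
A check of this shape can be sketched as follows. `StubEnv`, the module list, and the choice of `0` as a valid action are all assumptions for illustration; the real script imports the project's own modules and environment.

```python
import importlib


class StubEnv:
    """Hypothetical stand-in for the project's environment."""

    def reset(self):
        return {"t": 0}

    def step(self, action):
        return {"t": 1}, 0.0, False


def smoke_test(module_names, make_env):
    """Import each module, then run one reset() and one step()."""
    # 1. Import validation: raises ImportError if a dependency is missing.
    for name in module_names:
        importlib.import_module(name)
    # 2. One reset() followed by one step() on a fresh environment.
    env = make_env()
    obs = env.reset()
    result = env.step(0)  # assumption: 0 is a valid action
    return obs, result


obs, result = smoke_test(["json", "math"], StubEnv)
print("smoke test passed:", obs, result)
```

Because nothing here loads model weights or touches the network, a check like this stays fast enough to run at the end of an install.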

`scripts/upload_hf_model_cards.py`

Uploads model cards and training artifacts to Hugging Face model repositories:

docs/HF_MODEL_CARD_V4.md → jdsb06/lifestack-grpo-v4/README.md
docs/HF_MODEL_CARD_V1.md → jdsb06/lifestack-grpo/README.md
train_run_v1.log → jdsb06/lifestack-grpo/
Relevant plots → jdsb06/lifestack-grpo-v4/plots/

python scripts/upload_hf_model_cards.py   # requires HF_TOKEN or huggingface-cli login
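
The file-to-repository mapping above could be driven by a small table like the one below. This is a sketch under assumptions, not the script's actual code: `UPLOADS` and `upload_all` are hypothetical names, and the plots entry is omitted because the list of "relevant plots" is not specified here. The `upload_file` call uses the keyword arguments of `huggingface_hub.HfApi.upload_file`.

```python
import os

# (local path, target repo, path inside the repo), taken from the list above.
UPLOADS = [
    ("docs/HF_MODEL_CARD_V4.md", "jdsb06/lifestack-grpo-v4", "README.md"),
    ("docs/HF_MODEL_CARD_V1.md", "jdsb06/lifestack-grpo", "README.md"),
    ("train_run_v1.log", "jdsb06/lifestack-grpo", "train_run_v1.log"),
]


def upload_all(api, uploads=UPLOADS):
    """Upload each file through `api`, an object exposing HfApi.upload_file."""
    uploaded = []
    for local_path, repo_id, path_in_repo in uploads:
        api.upload_file(
            path_or_fileobj=local_path,
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type="model",
        )
        uploaded.append((repo_id, path_in_repo))
    return uploaded


# Usage (requires the huggingface_hub package and network access):
#   from huggingface_hub import HfApi
#   upload_all(HfApi(token=os.environ.get("HF_TOKEN")))
```

Passing the API object in as a parameter keeps the mapping testable without network access.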

`scripts/run_episode.py`

Runs one full episode with the LifeStackAgent (GRPO model or Groq API fallback). Requires credentials to be configured.

python scripts/run_episode.py
python scripts/run_episode.py --difficulty 3 --verbose

Related files

docs/train_trl.md — full train_trl.py reference
docs/eval.md — eval.py reference
docs/training_guide.md — end-to-end training guide
