# True LLM Learning Evaluation (Pre-RL vs Post-RL) This folder is for checkpoint-vs-checkpoint evidence: - pre-RL base model - post-RL trained checkpoint Both are evaluated with an identical protocol. ## Required environment variables - `BASELINE_MODEL_NAME` - `TRAINED_MODEL_PATH` (local directory with `adapter_config.json`) - `ENV_BASE_URL` (CommitmentOS HTTP API) Optional: - `HF_TOKEN` (gated Hub models / rate limits) Optional protocol overrides: - `EVAL_SEED` (default: `42`) - `EVAL_MAX_STEPS` (default: `12`) - `EVAL_TEMPERATURE` (default: `0.0`) - `EVAL_TOP_P` (default: `1.0`) - `EVAL_MAX_NEW_TOKENS` (default: `256`) - `EVAL_SUCCESS_THRESHOLD` (default: `0.6`) ## Run ```bash cd commitment_os pip install -e ".[llm-eval]" python3 evaluation/evaluate_llm_checkpoints.py python3 evaluation/plot_llm_checkpoints.py ``` The evaluator prints one line per task (`[eval …] task i/n`) so long Colab runs do not look frozen. ## After Colab Zip weights + artifacts for download (paths assume `/content/commitment_os`): ```bash cd /content/commitment_os && zip -r /content/commitment_os_bundle.zip training_output artifacts/evals_llm ``` Or copy `training_output/` and `artifacts/evals_llm/` to Google Drive if the zip is too large for the browser. These bundles are **not** checked into git (clone speed + history). A **~330MB** zip (weights + this folder) is a normal size: publish it as a **GitHub Release** asset, **HF Hub**, or **Google Drive**. **Drive (weights + this folder):** [commitment_os_bundle](https://drive.google.com/drive/folders/1yexZBSqyH7gWlTzYN5DlX3tXfPMmeVAK?usp=sharing) — after download you should have `artifacts/evals_llm/` (this layout) next to `training_output/`. See root **README** for `gdown` / `TRAINED_MODEL_PATH` notes. ## Expected outputs - `llm_eval_protocol.json` - `baseline_llm_eval.json` - `trained_llm_eval.json` - `llm_comparison.csv` - `llm_summary.json` - `llm_case_study_hard_015.md` - `llm_reward_by_task.svg` - `llm_violations_before_after.svg`