---
name: HF before delete
description: Always upload valuable data (ckpts, logs, predictions) to Hugging Face BEFORE deleting from local disk. Don't lose data to disk pressure.
type: feedback
originSessionId: 4037f43b-2133-46c6-84bd-02f7d454ec8b
---

**Rule**: Before running `rm` / `git rm` / quarantine-delete on any artifact under `/workspace/dnathinker/` or `/shm/dnathinker_quarantine/`, **first upload to HF if it isn't already mirrored**.

**Why**: User direction (2026-05-05) after I deleted Phase-8 RL through-MDLM `log.jsonl` files from `/shm/dnathinker_quarantine/` during cycle 50 cleanup. Those logs were the ONLY source for the `F6_rl_training_curves.pdf` paper figure. They were not on HF; deletion was irreversible. Result: F6 panel can't be regenerated and we lost reproducibility.

**How to apply**:

1. Before any `rm` on `/workspace/dnathinker/runs/` or `/shm/dnathinker_quarantine/`, check `MANIFEST.tsv` for an `HF: /` annotation.
2. If absent, run `scripts/innovations/hf_auto_uploader.py --once` (or specifically `hf_upload_finished_models.py` for ckpts) and verify upload succeeded BEFORE deleting.
3. For run-dir cleanup: keep `log.jsonl`, `manifest.json`, `train.log`, `eval_*.log`, and any `*_score*.json/md` — they're tiny and used to regenerate paper figures. ckpts can be deleted if HF-mirrored.
4. Annotate the deletion in `/shm/dnathinker_quarantine/MANIFEST.tsv` with the HF reference.

**Exceptions**: tmp files (`/tmp/*`, `_bench_logs` rotations explicitly marked stale, intermediate dataloader caches that are fast to regenerate) don't need HF round-trip.
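
**Sketch**: The pre-delete check in steps 1 to 3 could look roughly like the guard below. It is a minimal sketch, not the project's actual tooling. The layout of `MANIFEST.tsv` is not specified in this note, so the code assumes a tab-separated file with the artifact path in the first column and an `HF: ...` annotation in some later column. The helper names `must_keep`, `has_hf_annotation`, and `guarded_delete` are hypothetical. The sketch does not call the uploader scripts; it only reports when an upload is still needed.

```python
import csv
import fnmatch
from pathlib import Path

# Small files the note says to keep during run-dir cleanup.
KEEP_PATTERNS = ("log.jsonl", "manifest.json", "train.log",
                 "eval_*.log", "*_score*.json", "*_score*.md")


def must_keep(path):
    """True if the file name matches one of the keep patterns."""
    name = Path(path).name
    return any(fnmatch.fnmatch(name, pat) for pat in KEEP_PATTERNS)


def has_hf_annotation(manifest_path, artifact):
    """True if a manifest row for `artifact` has a non-empty 'HF:' field.

    Assumed layout (hypothetical): tab-separated, artifact path in the
    first column, an 'HF: ...' annotation in any later column.
    """
    manifest = Path(manifest_path)
    if not manifest.exists():
        return False
    with manifest.open(newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if not row or row[0] != str(artifact):
                continue
            for field in row[1:]:
                field = field.strip()
                if field.startswith("HF:") and field[3:].strip():
                    return True
    return False


def guarded_delete(manifest_path, artifact, dry_run=True):
    """Delete `artifact` only if it is HF-mirrored and not a keep file."""
    if must_keep(artifact):
        return "keep"
    if not has_hf_annotation(manifest_path, artifact):
        return "upload-first"  # run the uploader and verify before retrying
    if not dry_run:
        Path(artifact).unlink()
    return "would-delete" if dry_run else "deleted"
```

Defaulting to `dry_run=True` means a careless call cannot remove anything. A result of `"upload-first"` is the cue to run the uploader from step 2 and confirm it succeeded before calling the guard again.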