# MASTER CHECKLIST: WHAT NEEDS TO HAPPEN FOR PART 1
## Files Already Prepared (✅ Done)
| File | Purpose | Status |
|------|---------|--------|
| `PART1_QUICK_SUMMARY.md` | 1-page reference guide for venue | ✅ READY |
| `PART1_DEVELOPMENT_TRAINING_CHECKLIST.md` | Detailed step-by-step instructions | ✅ READY |
| `generate_curves.py` | Curve generation after training | ✅ READY |
| `BLOG_POST_TEMPLATE.md` | Storytelling framework | ✅ READY |
| `training/train.py` | Training script | ✅ READY |
| `training/config.yaml` | Optimized config (1500 episodes) | ✅ READY |
| `training/warmup_traces.jsonl` | SFT warmup data (20 examples) | ✅ READY |
| `permanence/env.py` | Core environment | ✅ READY |
---
## PART 1: DEVELOPMENT & TRAINING BREAKDOWN
### What Happens in PART 1
**At venue: 11:30 AM - 8:00 PM (8.5 hours)**
PART 1 is about **generating evidence that your environment actually teaches agents something.**
---
## WHAT YOU NEED TO DO (Concrete Tasks)
### PRE-VENUE (Before you leave today)
**Task 1.1: Verify repo is in good state**
```bash
cd "c:/Users/Hp/OneDrive/Desktop/meta"  # forward slashes: bash treats backslashes as escapes
git status # Should show nothing uncommitted
git log -1 # Last commit: "Add OpenEnv deployment files..."
```
Expected: No uncommitted changes, repo clean
**Task 1.2: Verify dependencies are specified**
```bash
grep -A 10 dependencies pyproject.toml
```
Expected: Lists torch, transformers, trl, unsloth, datasets, peft
**Task 1.3: Verify training config is correct**
```bash
cat training/config.yaml
```
Expected: `total_episodes: 1500`, `group_size: 8`, `load_in_4bit: true`
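Reading the YAML by eye is easy to get wrong under time pressure, so the check can also be scripted. The following is a minimal sketch, not part of the repo. It assumes the three keys may sit at any nesting depth in `training/config.yaml` (the file's actual layout is not shown in this checklist) and that PyYAML is installed. The helper names `find_key`, `config_problems`, and `check_file` are hypothetical.

```python
from typing import Any

# Expected values quoted from Task 1.3 of this checklist.
EXPECTED = {"total_episodes": 1500, "group_size": 8, "load_in_4bit": True}


def find_key(node: Any, key: str) -> Any:
    """Depth-first search for `key` in nested dicts/lists; None if absent."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def config_problems(cfg: dict) -> list:
    """Return one message per expected key that is missing or has another value."""
    problems = []
    for key, want in EXPECTED.items():
        got = find_key(cfg, key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, found {got!r}")
    return problems


def check_file(path: str = "training/config.yaml") -> list:
    """Load the YAML file and run the checks (assumes PyYAML is available)."""
    import yaml  # imported lazily so the pure checks above work without it

    with open(path, encoding="utf-8") as fh:
        return config_problems(yaml.safe_load(fh))
```

Calling `check_file()` from the repo root should return an empty list when the config matches the expected values; any returned message names the key to fix.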
---
### AT VENUE: PHASE 1 (11:30 AM - 12:00 PM) - GPU Setup
**Task 2.1: Get GPU access**
- Find venue staff
- Get SSH credentials or Colab link
- **CRITICAL:** Confirm GPU type (A100, RTX 4090, H100, etc.)
- If NO GPU: Escalate immediately to L2 mentor
**Task 2.2: Verify CUDA works**
```bash
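python -c "import torch; print(torch.cuda.get_device_name(0)); print(f'{torch.cuda.get_device_properties(0).total_memory / 2**30:.0f} GiB')"
```
Expected: Prints the GPU name and its memory in GiB (e.g., an A100 40GB card prints roughly "40 GiB"; dividing by 1e9 instead would print a larger number than the card's label)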
**Task 2.3: Clone repo and install dependencies**
```bash
git clone https://github.com/chanikkyasaai/permanence
cd permanence
pip install -e .
pip install torch transformers trl unsloth datasets peft
```
Expected: No errors, all packages install successfully
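A quiet `pip` run does not always mean every package landed in the active interpreter, so a quick lookup can confirm it. This is a minimal sketch using only the standard library. The package list is copied from Task 2.3, and the function name `missing_packages` is hypothetical. Note that it only checks that each package can be located; it does not prove that the package imports cleanly against the installed CUDA version.

```python
import importlib.util

# Package names quoted from Task 2.3; each is also its top-level import name.
REQUIRED = ["torch", "transformers", "trl", "unsloth", "datasets", "peft"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `print(missing_packages())` in the venue environment should print `[]`; any name it lists needs to be reinstalled before training starts.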
**Task 2.4: Verify environment works**
```bash
python -c "from permanence.env import PermanenceEnv; print('✅ OK')"
```
Expected: Prints "✅ OK"
**By 12:00 PM: You should have GPU ready, repo cloned, dependencies installed, environment verified.**
---
### AT VENUE: PHASE 2 (12:00 PM - 7:30 PM) - Training Execution
**Task 3.1: START TRAINING (single command)**
```bash
python -m training.train --config training/config.yaml
```
**That's it. Press Enter. Training runs for 7 hours unattended.**
**What happens next:**
- Minutes 0-1: Model loading
- Minutes 1-3: Data loading
- Minutes 3-420: Training (1,500 episodes × ~0.28 min/episode, about 17 seconds each)
- Every 100 episodes: Progress printed to console
- Output: `permanence_output/training_log.json` with all metrics
**You can relax, walk around, eat, prepare for Part 2. Just don't close the terminal.**
**Checkpoint:** A checkpoint is saved every 500 episodes. If training crashes at episode 1400, you can resume from the episode-1000 checkpoint instead of starting over.
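The timing and checkpoint numbers above are simple arithmetic, and it helps to have them as functions so you can re-check the pace from the console output during the run. This is a minimal sketch with hypothetical helper names. It assumes a steady pace per episode and assumes checkpoints land exactly on multiples of 500; how `training/train.py` actually resumes from a checkpoint is not specified in this checklist.

```python
def minutes_per_episode(total_minutes: float, setup_minutes: float, episodes: int) -> float:
    """Average time budget per episode once setup time is subtracted."""
    return (total_minutes - setup_minutes) / episodes


def eta_minutes(done: int, elapsed_minutes: float, total: int) -> float:
    """Remaining minutes, extrapolating from the average pace so far."""
    if done <= 0:
        raise ValueError("need at least one finished episode")
    return elapsed_minutes / done * (total - done)


def last_checkpoint(crashed_at: int, every: int = 500) -> int:
    """Most recent checkpointed episode at or before the crash (0 if none)."""
    return (crashed_at // every) * every


# 420 minutes total, 3 minutes of loading, 1,500 episodes:
print(round(minutes_per_episode(420, 3, 1500), 3))  # 0.278
# A crash at episode 1400 falls back to the checkpoint at episode 1000:
print(last_checkpoint(1400))  # 1000
```

If the first 100 episodes take noticeably longer than about 28 minutes, `eta_minutes` will show early that the run will not finish by 7:30 PM, which leaves time to cut the episode count rather than lose the curves.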
---
### AT VENUE: PHASE 3 (7:30 PM - 8:00 PM) - Post-Training Verification
**Task 4.1: Generate training curves**
```bash
python generate_curves.py
```
Expected: Creates `results/training_curves.png` (4-panel plot)
**Task 4.2: Verify curves look good**
- Open `results/training_curves.png`
- Check Panel 1 (Reward): Should trend **upward** (from negative to positive)
- Check Panel 2 (Loss): Should trend **downward** (convergence)
- Check Panel 3 (Catastrophe): Should trend **downward** (improvement)
- Check Panel 4 (Accuracy): Should trend **upward** (improvement)
If the curves look wrong, check `permanence_output/training_log.json` for errors.
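Judging a noisy curve by eye can be misleading, so the four panel checks can be backed by a number. This is a minimal sketch that compares the mean of the last 10% of values with the mean of the first 10%. The function names are hypothetical, and the metric names in the usage note are placeholders: the actual structure of `training_log.json` is not shown in this checklist, so extracting the per-episode series from the log is left to you.

```python
from statistics import mean


def trend(values, fraction=0.1):
    """Mean of the last `fraction` of values minus the mean of the first `fraction`."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    k = max(1, int(len(values) * fraction))
    return mean(values[-k:]) - mean(values[:k])


def curve_report(series, want_up):
    """Map each metric name to True when it moved in the desired direction.

    series:  metric name -> list of per-episode values
    want_up: metric name -> True if higher is better, False if lower is better
    """
    return {name: (trend(vals) > 0) == want_up[name] for name, vals in series.items()}
```

For example, with placeholder metric names, `curve_report({"reward": rewards, "loss": losses}, {"reward": True, "loss": False})` returns `True` for each metric that improved over the run. A `False` entry is the panel to inspect first.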
**Task 4.3: Verify model loads**
```bash
python -c "from transformers import AutoModelForCausalLM; m = AutoModelForCausalLM.from_pretrained('./permanence_output/final_model'); print('✅ Model loads')"
```
Expected: Prints "✅ Model loads"
**Task 4.4: Commit results**
```bash
git add permanence_output/training_log.json results/training_curves.png results/training_summary.txt
git commit -m "Training complete: 1500 episodes, reward improvement verified"
```
Expected: Commit succeeds, files tracked
**By 8:00 PM: You have training curves, metrics, and proof that the environment works.**
---
## DELIVERABLES AT END OF PART 1
By 8:00 PM, you will have:
```
permanence_output/
├── training_log.json          ← 1,500 episodes of metrics
├── final_model/               ← Trained weights
│   └── pytorch_model.bin
└── checkpoint_*
results/
├── training_curves.png        ← JUDGES WANT THIS
├── training_summary.txt       ← Numerical metrics
└── training_comparison.md
Git commits with all artifacts tracked
```
---
## SUCCESS CRITERIA FOR PART 1
✅ You've completed PART 1 if:
- [ ] Training ran for 7 hours without crashing
- [ ] permanence_output/training_log.json exists with 1,500 episodes
- [ ] results/training_curves.png exists and shows improvement
- [ ] Reward curve trending upward
- [ ] Catastrophe rate trending downward (from ~43% to <20%)
- [ ] Prediction accuracy trending upward (from ~31% to >50%)
- [ ] Trained model loads successfully
- [ ] All results committed to git
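The file-existence items in this list can be checked mechanically before you commit. This is a minimal sketch using only the standard library. The paths are copied from this checklist, and the function name `missing_artifacts` is hypothetical. It checks existence only, not that the log holds 1,500 episodes or that the curves show improvement.

```python
from pathlib import Path

# Paths quoted from the deliverables and success criteria in this checklist.
REQUIRED_ARTIFACTS = [
    "permanence_output/training_log.json",
    "permanence_output/final_model",
    "results/training_curves.png",
]


def missing_artifacts(root=".", required=REQUIRED_ARTIFACTS):
    """Return the required paths that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).exists()]
```

Running `print(missing_artifacts())` from the repo root should print `[]` once training and curve generation have finished.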
---
## WHAT COMES AFTER PART 1 (PART 2)
Once PART 1 is complete (8:00 PM), you'll have 21 hours until the deadline (5:00 PM next day) to do PART 2:
**PART 2 Tasks:**
1. Write mini-blog or record <2min video explaining results
2. Update README with storytelling arc + curve + links
3. Push to HuggingFace Space
4. Update GitHub with final links
5. Submit Google Form
(PART 2 checklist will be provided separately once PART 1 is done)
---
## KEY FACTS
**PART 1 is the bottleneck.** Everything depends on getting GPU training to work.
**Judges explicitly state:** "At minimum, loss and reward plots from a real run."
**Right now:** You have 0/20 on the "Training Evidence" criterion. After PART 1, you'll have 7/20.
**The difference:** Disqualification vs. Contention.
**What must happen:** Train for 7 hours, generate curves, commit results.
**Contingency:** If GPU fails, you can still explain the technical architecture to judges. But curves are what wins.
---
## IMMEDIATE NEXT STEPS
### Today (Before Venue):
- [ ] Print or bookmark `PART1_QUICK_SUMMARY.md` (2 pages, reference at venue)
- [ ] Review `PART1_DEVELOPMENT_TRAINING_CHECKLIST.md` (detailed steps)
- [ ] Verify training/config.yaml one more time
- [ ] Make sure laptop has repo cloned locally (backup copy)
### At Venue (11:30 AM):
- [ ] Find GPU
- [ ] Follow PART1_QUICK_SUMMARY.md steps 1-3
- [ ] Start training at 12:00 PM
- [ ] Follow post-training steps at 7:30 PM
- [ ] Curves ready by 8:00 PM
**That's the entire PART 1 plan. Nothing more complicated than that.**