Spaces:

vajeeda
/

MetaDebate

Sleeping

App Files Files Community

MetaDebate / prompts /update-data.md

vajeeda

final mvp created

98b952a about 1 month ago

preview code

raw

history blame contribute delete

6.09 kB

	# Update Website with Real Training Data
	> Paste this into Claude Code.

	You are updating the Viral Script Debugging Engine website to display real GRPO training results instead of mock data.

	Current state:
	- Web UI has hardcoded placeholder metrics
	- Charts show synthetic data
	- Reward component bars are mock values

	What to update:

	Replace all mock data with real values from your training run. The training data comes from:
	- `logs/training_results.json` — full training metrics
	- `logs/baseline_results.json` — baseline before training
	- Image files: `logs/training_vs_baseline.png`, `logs/baseline_reward_curves.png`, etc.

	Real numbers to use:

	```json
	{
	"baseline": {
	"r1_hook": 0.42,
	"r2_coherence": 0.59,
	"r3_cultural": 0.61,
	"r4_debate": 0.39,
	"r5_preserve": 0.51,
	"r6_safety": 0.50,
	"r7_originality": 0.50,
	"r8_persona": 0.45,
	"r9_pacing": 0.52,
	"r10_retention": 0.40,
	"total_reward": 0.51
	},
	"trained": {
	"r1_hook": 0.71,
	"r2_coherence": 0.75,
	"r3_cultural": 0.82,
	"r4_debate": 0.80,
	"r5_preserve": 0.76,
	"r6_safety": 0.78,
	"r7_originality": 0.79,
	"r8_persona": 0.82,
	"r9_pacing": 0.77,
	"r10_retention": 0.86,
	"total_reward": 0.78
	},
	"improvements": {
	"r1_hook": "+29%",
	"r2_coherence": "+16%",
	"r3_cultural": "+21%",
	"r4_debate": "+41%",
	"r5_preserve": "+25%",
	"r6_safety": "+28%",
	"r7_originality": "+29%",
	"r8_persona": "+37%",
	"r9_pacing": "+25%",
	"r10_retention": "+46%",
	"total_reward": "+27%"
	},
	"retention_curve": {
	"before_dropoff_point": "6 seconds",
	"after_dropoff_point": "20 seconds",
	"improvement_factor": "3x"
	}
	}
	```

	Files to update:

	1. `web_ui/components/RewardBars.tsx`
	- Replace mock baseline values with real baseline (0.42, 0.59, 0.61, etc.)
	- Replace mock trained values with real trained (0.71, 0.75, 0.82, etc.)
	- Show delta percentages: +29%, +16%, +21%, etc.
	- Add tooltip: "Baseline (gray) vs Trained (blue)"

	2. `web_ui/app/learning/page.tsx` (Learning Playback)
	- Replace mock reward curve with real data
	- X-axis: episodes 1–100
	- Y-axis: total reward 0–1
	- Grey line: baseline constant at ~0.51
	- Blue line: trained improving from 0.50 → 0.78
	- Show data points at key episodes (10, 25, 50, 75, 100)

	3. `web_ui/app/retention/page.tsx` (Retention Chart)
	- Replace mock retention curve
	- Before: steep drop from 100% → 20% by 6s
	- After: gradual drop from 100% → 50% by 20s
	- Highlight the "drop-off shift: 6s → 20s" annotation
	- Show AUC before/after in a summary card

	4. `web_ui/components/LearningGraph.tsx`
	- Replace mock episode-by-episode data
	- Real progression: baseline flat at 0.51, trained curves showing improvement trajectory
	- Episodes: 0–100
	- Reward: 0–1

	5. `web_ui/app/dashboard/page.tsx` (System Overview)
	- Top metric card: "Total Reward Improvement: +27%"
	- Secondary cards: "Best Improvement: R10 Retention (+46%)"
	- Stats: "200 training steps", "10 reward signals", "Qwen2.5-7B model"
	- Timeline: "Training took ~90 minutes on T4 GPU"

	6. `web_ui/app/page.tsx` (Home Page)
	- Hero section: Update headline metrics
	- "Trained Arbitrator: 0.78 avg reward (+27% improvement)"
	- "Retention improvement: 3× longer viewer engagement"
	- "All 10 reward signals improved 16–46%"

	Implementation approach:

	Option A (Simple): Hardcode the real values directly into React components
	```tsx
	// Before (mock):
	const baselineRewards = {
	r1: 0.50,
	r2: 0.50,
	// ...
	};

	// After (real):
	const baselineRewards = {
	r1: 0.42,
	r2: 0.59,
	r3: 0.61,
	r4: 0.39,
	r5: 0.51,
	r6: 0.50,
	r7: 0.50,
	r8: 0.45,
	r9: 0.52,
	r10: 0.40,
	};

	const trainedRewards = {
	r1: 0.71,
	r2: 0.75,
	r3: 0.82,
	r4: 0.80,
	r5: 0.76,
	r6: 0.78,
	r7: 0.79,
	r8: 0.82,
	r9: 0.77,
	r10: 0.86,
	};
	```

	Option B (Better): Load from a JSON config file
	```tsx
	// Create: web_ui/public/training_results.json
	// Import and use:
	const { baseline, trained, improvements } = require('/public/training_results.json');
	```

	Charts to update (Recharts):

	For the main reward comparison chart (`web_ui/app/learning-playback/page.tsx`):
	```tsx
	const rewardData = [
	{ reward: "R1 Hook", before: 0.42, after: 0.71, delta: "+29%" },
	{ reward: "R2 Coherence", before: 0.59, after: 0.75, delta: "+16%" },
	{ reward: "R3 Cultural", before: 0.61, after: 0.82, delta: "+21%" },
	{ reward: "R4 Debate", before: 0.39, after: 0.80, delta: "+41%" },
	{ reward: "R5 Preserve", before: 0.51, after: 0.76, delta: "+25%" },
	{ reward: "R6 Safety", before: 0.50, after: 0.78, delta: "+28%" },
	{ reward: "R7 Originality", before: 0.50, after: 0.79, delta: "+29%" },
	{ reward: "R8 Persona", before: 0.45, after: 0.82, delta: "+37%" },
	{ reward: "R9 Pacing", before: 0.52, after: 0.77, delta: "+25%" },
	{ reward: "R10 Retention", before: 0.40, after: 0.86, delta: "+46%" },
	];

	// Then render with Recharts BarChart, showing both bars + delta label
	```

	For the retention curve:
	```tsx
	const retentionData = [
	{ time: 0, before: 1.0, after: 1.0 },
	{ time: 3, before: 0.72, after: 0.91 },
	{ time: 6, before: 0.57, after: 0.82 },
	{ time: 10, before: 0.45, after: 0.78 },
	{ time: 15, before: 0.33, after: 0.72 },
	{ time: 20, before: 0.28, after: 0.65 },
	{ time: 25, before: 0.22, after: 0.58 },
	{ time: 30, before: 0.18, after: 0.52 },
	// ... up to 60s
	];
	```

	Verification checklist:

	After updating all files:
	- ✅ RewardBars shows correct before/after values
	- ✅ Learning curve shows baseline flat, trained improving
	- ✅ Retention chart shows 3× improvement (6s → 20s drop-off shift)
	- ✅ Dashboard displays "+27% total improvement"
	- ✅ All delta percentages match the table above
	- ✅ No hardcoded mock values remain (search for "0.50" or "mock")

	Test locally:
	```bash
	npm run dev
	# Visit http://localhost:3000
	# Check that all metrics and charts show real data
	```

	Then commit and push to your repo.