Update Website with Real Training Data
Paste this into Claude Code.
You are updating the Viral Script Debugging Engine website to display real GRPO training results instead of mock data.
Current state:
- Web UI has hardcoded placeholder metrics
- Charts show synthetic data
- Reward component bars are mock values
What to update:
Replace all mock data with real values from your training run. The training data comes from:
- logs/training_results.json: full training metrics
- logs/baseline_results.json: baseline before training
- Image files: logs/training_vs_baseline.png, logs/baseline_reward_curves.png, etc.
Real numbers to use:
{
  "baseline": {
    "r1_hook": 0.42,
    "r2_coherence": 0.59,
    "r3_cultural": 0.61,
    "r4_debate": 0.39,
    "r5_preserve": 0.51,
    "r6_safety": 0.50,
    "r7_originality": 0.50,
    "r8_persona": 0.45,
    "r9_pacing": 0.52,
    "r10_retention": 0.40,
    "total_reward": 0.51
  },
  "trained": {
    "r1_hook": 0.71,
    "r2_coherence": 0.75,
    "r3_cultural": 0.82,
    "r4_debate": 0.80,
    "r5_preserve": 0.76,
    "r6_safety": 0.78,
    "r7_originality": 0.79,
    "r8_persona": 0.82,
    "r9_pacing": 0.77,
    "r10_retention": 0.86,
    "total_reward": 0.78
  },
  "improvements": {
    "r1_hook": "+29%",
    "r2_coherence": "+16%",
    "r3_cultural": "+21%",
    "r4_debate": "+41%",
    "r5_preserve": "+25%",
    "r6_safety": "+28%",
    "r7_originality": "+29%",
    "r8_persona": "+37%",
    "r9_pacing": "+25%",
    "r10_retention": "+46%",
    "total_reward": "+27%"
  },
  "retention_curve": {
    "before_dropoff_point": "6 seconds",
    "after_dropoff_point": "20 seconds",
    "improvement_factor": "3x"
  }
}
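The "improvements" values are the absolute difference between trained and baseline, expressed in percentage points (for example, r1_hook goes from 0.42 to 0.71, which is +29 points, or roughly a 69% relative gain). A minimal sketch of a consistency check follows; the helper names `deltaPoints`, `formatDelta`, and `findMismatches` are hypothetical and not part of the project, and only a few entries are copied in for brevity.

```typescript
// Hypothetical helpers for illustration; not part of the original codebase.
type Scores = Record<string, number>;

// Difference between trained and baseline, in whole percentage points.
function deltaPoints(baseline: number, trained: number): number {
  return Math.round((trained - baseline) * 100);
}

// Formats a delta the same way the JSON does, e.g. "+29%".
function formatDelta(points: number): string {
  return `${points >= 0 ? "+" : ""}${points}%`;
}

// Returns the keys whose stated improvement disagrees with the recomputed one.
function findMismatches(
  baseline: Scores,
  trained: Scores,
  improvements: Record<string, string>,
): string[] {
  return Object.keys(improvements).filter(
    (key) => improvements[key] !== formatDelta(deltaPoints(baseline[key], trained[key])),
  );
}

// A few entries copied from the JSON above.
const baseline: Scores = { r1_hook: 0.42, r4_debate: 0.39, r10_retention: 0.40, total_reward: 0.51 };
const trained: Scores = { r1_hook: 0.71, r4_debate: 0.80, r10_retention: 0.86, total_reward: 0.78 };
const improvements = { r1_hook: "+29%", r4_debate: "+41%", r10_retention: "+46%", total_reward: "+27%" };

// An empty array means every stated delta matches the recomputed one.
console.log(findMismatches(baseline, trained, improvements));
```

Running the same check over all eleven keys before wiring the numbers into the UI is a cheap way to catch a transcription slip.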
Files to update:
web_ui/components/RewardBars.tsx
- Replace mock baseline values with real baseline (0.42, 0.59, 0.61, etc.)
- Replace mock trained values with real trained (0.71, 0.75, 0.82, etc.)
- Show delta percentages: +29%, +16%, +21%, etc.
- Add tooltip: "Baseline (gray) vs Trained (blue)"
web_ui/app/learning/page.tsx (Learning Playback)
- Replace mock reward curve with real data
- X-axis: episodes 1–100
- Y-axis: total reward 0–1
- Gray line: baseline constant at ~0.51
- Blue line: trained improving from 0.50 → 0.78
- Show data points at key episodes (10, 25, 50, 75, 100)
web_ui/app/retention/page.tsx (Retention Chart)
- Replace mock retention curve
- Before: steep drop from 100% → 20% by 6s
- After: gradual drop from 100% → 50% by 20s
- Highlight the "drop-off shift: 6s → 20s" annotation
- Show AUC before/after in a summary card
web_ui/components/LearningGraph.tsx
- Replace mock episode-by-episode data
- Real progression: baseline flat at 0.51, trained curve showing the improvement trajectory
- Episodes: 0–100
- Reward: 0β1
web_ui/app/dashboard/page.tsx (System Overview)
- Top metric card: "Total Reward Improvement: +27%"
- Secondary cards: "Best Improvement: R10 Retention (+46%)"
- Stats: "200 training steps", "10 reward signals", "Qwen2.5-7B model"
- Timeline: "Training took ~90 minutes on T4 GPU"
web_ui/app/page.tsx (Home Page)
- Hero section: Update headline metrics
- "Trained Arbitrator: 0.78 avg reward (+27% improvement)"
- "Retention improvement: 3× longer viewer engagement"
- "All 10 reward signals improved 16–46%"
Implementation approach:
Option A (Simple): Hardcode the real values directly into React components
// Before (mock):
const baselineRewards = {
  r1: 0.50,
  r2: 0.50,
  // ...
};
// After (real):
const baselineRewards = {
  r1: 0.42,
  r2: 0.59,
  r3: 0.61,
  r4: 0.39,
  r5: 0.51,
  r6: 0.50,
  r7: 0.50,
  r8: 0.45,
  r9: 0.52,
  r10: 0.40,
};
const trainedRewards = {
  r1: 0.71,
  r2: 0.75,
  r3: 0.82,
  r4: 0.80,
  r5: 0.76,
  r6: 0.78,
  r7: 0.79,
  r8: 0.82,
  r9: 0.77,
  r10: 0.86,
};
Option B (Better): Load from a JSON config file
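// Create: web_ui/public/training_results.json
// Assuming a Next.js app: files in public/ are served from the site root,
// so load the file by URL without the /public prefix (inside an async function or effect):
const response = await fetch('/training_results.json');
const { baseline, trained, improvements } = await response.json();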
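If the results are loaded from JSON at runtime, a small shape check can stop a malformed file from silently rendering empty charts. The sketch below is an assumption about how that might look: `TrainingResults`, `isRecordOf`, `parseTrainingResults`, and `loadTrainingResults` are hypothetical names, and only the three top-level keys come from the JSON in this prompt.

```typescript
// Hypothetical types and helpers; only the JSON keys come from the prompt above.
interface TrainingResults {
  baseline: Record<string, number>;
  trained: Record<string, number>;
  improvements: Record<string, string>;
}

// True when value is a plain object whose values all have the given primitive type.
function isRecordOf(value: unknown, kind: "number" | "string"): boolean {
  return (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value) &&
    Object.values(value).every((v) => typeof v === kind)
  );
}

// Throws if the parsed JSON is missing a section or has a value of the wrong type.
function parseTrainingResults(raw: unknown): TrainingResults {
  const obj = raw as Partial<Record<keyof TrainingResults, unknown>> | null;
  if (
    !obj ||
    !isRecordOf(obj.baseline, "number") ||
    !isRecordOf(obj.trained, "number") ||
    !isRecordOf(obj.improvements, "string")
  ) {
    throw new Error("training_results.json has an unexpected shape");
  }
  return obj as unknown as TrainingResults;
}

// Assumes the file lives in a Next.js public/ directory, which is served from
// the site root, so the browser URL has no /public prefix.
async function loadTrainingResults(url = "/training_results.json"): Promise<TrainingResults> {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Failed to load ${url}: ${response.status}`);
  return parseTrainingResults(await response.json());
}
```

In a client component, `loadTrainingResults()` would typically be called from an effect, with the result stored in state; in a server component the JSON could be read from disk instead.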
Charts to update (Recharts):
For the main reward comparison chart (web_ui/app/learning-playback/page.tsx):
const rewardData = [
  { reward: "R1 Hook", before: 0.42, after: 0.71, delta: "+29%" },
  { reward: "R2 Coherence", before: 0.59, after: 0.75, delta: "+16%" },
  { reward: "R3 Cultural", before: 0.61, after: 0.82, delta: "+21%" },
  { reward: "R4 Debate", before: 0.39, after: 0.80, delta: "+41%" },
  { reward: "R5 Preserve", before: 0.51, after: 0.76, delta: "+25%" },
  { reward: "R6 Safety", before: 0.50, after: 0.78, delta: "+28%" },
  { reward: "R7 Originality", before: 0.50, after: 0.79, delta: "+29%" },
  { reward: "R8 Persona", before: 0.45, after: 0.82, delta: "+37%" },
  { reward: "R9 Pacing", before: 0.52, after: 0.77, delta: "+25%" },
  { reward: "R10 Retention", before: 0.40, after: 0.86, delta: "+46%" },
];
// Then render with Recharts BarChart, showing both bars + delta label
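If the JSON route (Option B) is used, the chart rows do not need to be retyped; they can be derived from the loaded objects. The following is a sketch under that assumption. `RewardRow`, `toLabel`, and `toChartRows` are hypothetical names, and skipping `total_reward` reflects the fact that the rows above list only the ten individual signals.

```typescript
// Hypothetical helpers; the row shape mirrors the rewardData array above.
interface RewardRow {
  reward: string;
  before: number;
  after: number;
  delta: string;
}

// "r10_retention" -> "R10 Retention"
function toLabel(key: string): string {
  return key
    .split("_")
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join(" ");
}

// Builds one chart row per reward signal, in the key order of the baseline object.
function toChartRows(
  baseline: Record<string, number>,
  trained: Record<string, number>,
  improvements: Record<string, string>,
): RewardRow[] {
  return Object.keys(baseline)
    .filter((key) => key !== "total_reward") // aggregate is shown separately, not as a bar
    .map((key) => ({
      reward: toLabel(key),
      before: baseline[key],
      after: trained[key],
      delta: improvements[key],
    }));
}
```

The resulting array can be passed as the `data` prop of a Recharts `BarChart`, with `before` and `after` as the two `Bar` data keys.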
For the retention curve:
const retentionData = [
  { time: 0, before: 1.0, after: 1.0 },
  { time: 3, before: 0.72, after: 0.91 },
  { time: 6, before: 0.57, after: 0.82 },
  { time: 10, before: 0.45, after: 0.78 },
  { time: 15, before: 0.33, after: 0.72 },
  { time: 20, before: 0.28, after: 0.65 },
  { time: 25, before: 0.22, after: 0.58 },
  { time: 30, before: 0.18, after: 0.52 },
  // ... up to 60s
];
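The retention page also asks for an "AUC before/after" summary card. One way to compute it from points like these is the trapezoidal rule, sketched below. `RetentionPoint`, `retentionAuc`, and `listedPoints` are hypothetical names, and the sample uses only the points listed above (0 to 30 s), since the rest of the 60 s series is not given here.

```typescript
// Hypothetical helper for the "AUC before/after" summary card.
interface RetentionPoint {
  time: number;   // seconds into the video
  before: number; // fraction of viewers still watching (baseline)
  after: number;  // fraction of viewers still watching (trained)
}

// Area under one retention curve via the trapezoidal rule.
// Assumes points are sorted by time. Units: fraction x seconds.
function retentionAuc(points: RetentionPoint[], series: "before" | "after"): number {
  let area = 0;
  for (let i = 1; i < points.length; i++) {
    const width = points[i].time - points[i - 1].time;
    area += ((points[i][series] + points[i - 1][series]) / 2) * width;
  }
  return area;
}

// Only the points listed in the prompt (0 to 30 s); extend with the real series.
const listedPoints: RetentionPoint[] = [
  { time: 0, before: 1.0, after: 1.0 },
  { time: 3, before: 0.72, after: 0.91 },
  { time: 6, before: 0.57, after: 0.82 },
  { time: 10, before: 0.45, after: 0.78 },
  { time: 15, before: 0.33, after: 0.72 },
  { time: 20, before: 0.28, after: 0.65 },
  { time: 25, before: 0.22, after: 0.58 },
  { time: 30, before: 0.18, after: 0.52 },
];

const aucBefore = retentionAuc(listedPoints, "before");
const aucAfter = retentionAuc(listedPoints, "after");
console.log(aucBefore.toFixed(2), aucAfter.toFixed(2), (aucAfter / aucBefore).toFixed(2));
```

Because the AUC depends on how far the series extends, the card should state the time window it covers.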
Verification checklist:
After updating all files:
- RewardBars shows correct before/after values
- Learning curve shows baseline flat, trained improving
- Retention chart shows 3× improvement (6s → 20s drop-off shift)
- Dashboard displays "+27% total improvement"
- All delta percentages match the table above
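- No hardcoded mock values remain (search for "mock"; a search for "0.50" also matches the real r6_safety and r7_originality baselines, so review those hits by hand)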
Test locally:
npm run dev
# Visit http://localhost:3000
# Check that all metrics and charts show real data
Then commit and push to your repo.