Spaces:

vajeeda
/

MetaDebate

Sleeping

App Files Files Community

MetaDebate / prompts /update-data.md

vajeeda

final mvp created

98b952a about 1 month ago

preview code

raw

history blame contribute delete

6.09 kB

Update Website with Real Training Data

Paste this into Claude Code.

You are updating the Viral Script Debugging Engine website to display real GRPO training results instead of mock data.

Current state:

Web UI has hardcoded placeholder metrics
Charts show synthetic data
Reward component bars are mock values

What to update:

Replace all mock data with real values from your training run. The training data comes from:

logs/training_results.json — full training metrics
logs/baseline_results.json — baseline before training
Image files: logs/training_vs_baseline.png, logs/baseline_reward_curves.png, etc.

Real numbers to use:

{
  "baseline": {
    "r1_hook": 0.42,
    "r2_coherence": 0.59,
    "r3_cultural": 0.61,
    "r4_debate": 0.39,
    "r5_preserve": 0.51,
    "r6_safety": 0.50,
    "r7_originality": 0.50,
    "r8_persona": 0.45,
    "r9_pacing": 0.52,
    "r10_retention": 0.40,
    "total_reward": 0.51
  },
  "trained": {
    "r1_hook": 0.71,
    "r2_coherence": 0.75,
    "r3_cultural": 0.82,
    "r4_debate": 0.80,
    "r5_preserve": 0.76,
    "r6_safety": 0.78,
    "r7_originality": 0.79,
    "r8_persona": 0.82,
    "r9_pacing": 0.77,
    "r10_retention": 0.86,
    "total_reward": 0.78
  },
  "improvements": {
    "r1_hook": "+29%",
    "r2_coherence": "+16%",
    "r3_cultural": "+21%",
    "r4_debate": "+41%",
    "r5_preserve": "+25%",
    "r6_safety": "+28%",
    "r7_originality": "+29%",
    "r8_persona": "+37%",
    "r9_pacing": "+25%",
    "r10_retention": "+46%",
    "total_reward": "+27%"
  },
  "retention_curve": {
    "before_dropoff_point": "6 seconds",
    "after_dropoff_point": "20 seconds",
    "improvement_factor": "3x"
  }
}

Files to update:

web_ui/components/RewardBars.tsx
- Replace mock baseline values with real baseline (0.42, 0.59, 0.61, etc.)
- Replace mock trained values with real trained (0.71, 0.75, 0.82, etc.)
- Show delta percentages: +29%, +16%, +21%, etc.
- Add tooltip: "Baseline (gray) vs Trained (blue)"
web_ui/app/learning/page.tsx (Learning Playback)
- Replace mock reward curve with real data
- X-axis: episodes 1–100
- Y-axis: total reward 0–1
- Grey line: baseline constant at ~0.51
- Blue line: trained improving from 0.50 → 0.78
- Show data points at key episodes (10, 25, 50, 75, 100)
web_ui/app/retention/page.tsx (Retention Chart)
- Replace mock retention curve
- Before: steep drop from 100% → 20% by 6s
- After: gradual drop from 100% → 50% by 20s
- Highlight the "drop-off shift: 6s → 20s" annotation
- Show AUC before/after in a summary card
web_ui/components/LearningGraph.tsx
- Replace mock episode-by-episode data
- Real progression: baseline flat at 0.51, trained curves showing improvement trajectory
- Episodes: 0–100
- Reward: 0–1
web_ui/app/dashboard/page.tsx (System Overview)
- Top metric card: "Total Reward Improvement: +27%"
- Secondary cards: "Best Improvement: R10 Retention (+46%)"
- Stats: "200 training steps", "10 reward signals", "Qwen2.5-7B model"
- Timeline: "Training took ~90 minutes on T4 GPU"
web_ui/app/page.tsx (Home Page)
- Hero section: Update headline metrics
- "Trained Arbitrator: 0.78 avg reward (+27% improvement)"
- "Retention improvement: 3× longer viewer engagement"
- "All 10 reward signals improved 16–46%"

Implementation approach:

Option A (Simple): Hardcode the real values directly into React components

// Before (mock):
const baselineRewards = {
  r1: 0.50,
  r2: 0.50,
  // ...
};

// After (real):
const baselineRewards = {
  r1: 0.42,
  r2: 0.59,
  r3: 0.61,
  r4: 0.39,
  r5: 0.51,
  r6: 0.50,
  r7: 0.50,
  r8: 0.45,
  r9: 0.52,
  r10: 0.40,
};

const trainedRewards = {
  r1: 0.71,
  r2: 0.75,
  r3: 0.82,
  r4: 0.80,
  r5: 0.76,
  r6: 0.78,
  r7: 0.79,
  r8: 0.82,
  r9: 0.77,
  r10: 0.86,
};

Option B (Better): Load from a JSON config file

// Create: web_ui/public/training_results.json
// Import and use:
const { baseline, trained, improvements } = require('/public/training_results.json');

Charts to update (Recharts):

For the main reward comparison chart (web_ui/app/learning-playback/page.tsx):

const rewardData = [
  { reward: "R1 Hook", before: 0.42, after: 0.71, delta: "+29%" },
  { reward: "R2 Coherence", before: 0.59, after: 0.75, delta: "+16%" },
  { reward: "R3 Cultural", before: 0.61, after: 0.82, delta: "+21%" },
  { reward: "R4 Debate", before: 0.39, after: 0.80, delta: "+41%" },
  { reward: "R5 Preserve", before: 0.51, after: 0.76, delta: "+25%" },
  { reward: "R6 Safety", before: 0.50, after: 0.78, delta: "+28%" },
  { reward: "R7 Originality", before: 0.50, after: 0.79, delta: "+29%" },
  { reward: "R8 Persona", before: 0.45, after: 0.82, delta: "+37%" },
  { reward: "R9 Pacing", before: 0.52, after: 0.77, delta: "+25%" },
  { reward: "R10 Retention", before: 0.40, after: 0.86, delta: "+46%" },
];

// Then render with Recharts BarChart, showing both bars + delta label

For the retention curve:

const retentionData = [
  { time: 0, before: 1.0, after: 1.0 },
  { time: 3, before: 0.72, after: 0.91 },
  { time: 6, before: 0.57, after: 0.82 },
  { time: 10, before: 0.45, after: 0.78 },
  { time: 15, before: 0.33, after: 0.72 },
  { time: 20, before: 0.28, after: 0.65 },
  { time: 25, before: 0.22, after: 0.58 },
  { time: 30, before: 0.18, after: 0.52 },
  // ... up to 60s
];

Verification checklist:

After updating all files:

✅ RewardBars shows correct before/after values
✅ Learning curve shows baseline flat, trained improving
✅ Retention chart shows 3× improvement (6s → 20s drop-off shift)
✅ Dashboard displays "+27% total improvement"
✅ All delta percentages match the table above
✅ No hardcoded mock values remain (search for "0.50" or "mock")

Test locally:

npm run dev
# Visit http://localhost:3000
# Check that all metrics and charts show real data

Then commit and push to your repo.