Commit c9dfd11 by cy0307 · verified · 1 Parent(s): 351650d

Publish Ropedia Xperience-10M task baseline cards

README.md CHANGED
@@ -18,7 +18,7 @@ metrics:
18
  - mean-reciprocal-rank
19
  - mean-squared-error
20
  model-index:
21
- - name: Xperience-10M Minimal and Neural Task Baselines
22
  results:
23
  - task:
24
  type: robotics
@@ -55,7 +55,7 @@ model-index:
55
  name: neural MLP F1
56
  ---
57
 
58
- # Xperience-10M Minimal and Neural Task Baselines
59
 
60
  This repo stores the minimal baseline weights, neural MLP task-head checkpoints, and metrics for the 12-task Xperience-10M episode suite. It is meant to be read like a model audit, not advertised as a robot foundation model.
61
 
@@ -113,19 +113,19 @@ transfers them to H20 for manifest building, training, and evaluation.
113
 
114
  The companion artifact dataset repo stores CSV/JSON predictions and dashboard assets:
115
 
116
- https://huggingface.co/datasets/cy0307/ropedia-episode-task-suite-artifacts
117
 
118
  The public visual dashboard is here:
119
 
120
- https://huggingface.co/spaces/cy0307/ropedia-episode-task-suite
121
 
122
  Direct static app:
123
 
124
- https://cy0307-ropedia-episode-task-suite.static.hf.space/
125
 
126
  The full Hugging Face collection is here:
127
 
128
- https://huggingface.co/collections/cy0307/ropedia-episode-task-suite
129
 
130
  ## Minimal and Neural Architecture
131
 
@@ -174,8 +174,8 @@ This repo does not redistribute raw Xperience-10M videos or raw `annotation.hdf5
174
 
175
  GitHub:
176
 
177
- https://github.com/ChaoYue0307/ropedia-episode-task-suite
178
 
179
  GitHub Pages:
180
 
181
- https://chaoyue0307.github.io/ropedia-episode-task-suite/
 
18
  - mean-reciprocal-rank
19
  - mean-squared-error
20
  model-index:
21
+ - name: Ropedia Xperience-10M Task Baselines
22
  results:
23
  - task:
24
  type: robotics
 
55
  name: neural MLP F1
56
  ---
57
 
58
+ # Ropedia Xperience-10M Task Baselines
59
 
60
  This repo stores the minimal baseline weights, neural MLP task-head checkpoints, and metrics for the 12-task Xperience-10M episode suite. It is meant to be read like a model audit, not advertised as a robot foundation model.
61
 
 
113
 
114
  The companion artifact dataset repo stores CSV/JSON predictions and dashboard assets:
115
 
116
+ https://huggingface.co/datasets/cy0307/ropedia-xperience-10m-task-suite-artifacts
117
 
118
  The public visual dashboard is here:
119
 
120
+ https://huggingface.co/spaces/cy0307/ropedia-xperience-10m-task-suite
121
 
122
  Direct static app:
123
 
124
+ https://cy0307-ropedia-xperience-10m-task-suite.static.hf.space/
125
 
126
  The full Hugging Face collection is here:
127
 
128
+ https://huggingface.co/collections/cy0307/ropedia-xperience-10m-task-suite
129
 
130
  ## Minimal and Neural Architecture
131
 
 
174
 
175
  GitHub:
176
 
177
+ https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite
178
 
179
  GitHub Pages:
180
 
181
+ https://chaoyue0307.github.io/ropedia-xperience-10m-task-suite/
assets/charts/episode_task_scores.svg CHANGED
assets/charts/episode_task_scores_neural_mlp.svg CHANGED
assets/pipeline_diagram.png CHANGED

Git LFS Details

  • SHA256: 25ccbd6b360d2df40c5342bb6a0a55cbc84ab1e4f9a243e4e5addc31f659fd37
  • Pointer size: 131 Bytes
  • Size of remote file: 864 kB

Git LFS Details

  • SHA256: a4d5c5393952f8e399d6af3cb92a179a1de52094fa675ec62a48c95734d4c4e5
  • Pointer size: 131 Bytes
  • Size of remote file: 868 kB
assets/pipeline_diagram.svg CHANGED
assets/task_architectures.png CHANGED

Git LFS Details

  • SHA256: 2b4fd09da684a1e28341ba3a2d93b5f98908d01d31992d0bece2819b1acdec19
  • Pointer size: 131 Bytes
  • Size of remote file: 816 kB

Git LFS Details

  • SHA256: 1d8882e017283413e98a46b92da131d0b7a5f0dd4f1d296ee3e4a698bed02504
  • Pointer size: 131 Bytes
  • Size of remote file: 816 kB
assets/task_architectures.svg CHANGED
assets/task_suite_infographic.png CHANGED

Git LFS Details

  • SHA256: 0e6b4f8661fe1b3df902b685ef992a3ed01364ee43eccb8e5e6b7da0e4ce4183
  • Pointer size: 132 Bytes
  • Size of remote file: 1.25 MB

Git LFS Details

  • SHA256: 6dd90f87b7e6c4aaecf6568fff39cdf2640e0e03c11d6c9664e62a9a9e1daedd
  • Pointer size: 132 Bytes
  • Size of remote file: 1.15 MB
notes/reproducibility_audit.md CHANGED
@@ -2,7 +2,7 @@
2
 
3
  Audit date: 2026-05-30 Asia/Singapore.
4
 
5
- Purpose: verify that the committed Xperience-10M Episode Task Suite artifacts are
6
  real outputs from the scripts, not placeholder or fabricated metrics.
7
 
8
  ## Raw Inputs Checked
 
2
 
3
  Audit date: 2026-05-30 Asia/Singapore.
4
 
5
+ Purpose: verify that the committed Ropedia Xperience-10M Task Suite artifacts are
6
  real outputs from the scripts, not placeholder or fabricated metrics.
7
 
8
  ## Raw Inputs Checked
scripts/generate_visualizations.py CHANGED
@@ -24,6 +24,22 @@ DOCS = ROOT / "docs"
24
  ASSETS = DOCS / "assets"
25
  CHARTS = ASSETS / "charts"
26
 
27
 
28
  def read_json(path: Path) -> dict:
29
  return json.loads(path.read_text(encoding="utf-8"))
@@ -100,7 +116,7 @@ def svg_pipeline_diagram(path: Path, summary: dict) -> None:
100
  "numpy softmax classifier",
101
  "metrics and predictions",
102
  ], "#1f63e9"),
103
- (520, 380, 360, 168, "6. Episode task suite", [
104
  f"{task_count} supervised/self-supervised tasks",
105
  "chronological split",
106
  "retrieval, forecast, alignment",
@@ -117,7 +133,7 @@ def svg_pipeline_diagram(path: Path, summary: dict) -> None:
117
  f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" viewBox="0 0 {width} {height}">',
118
  '<rect width="100%" height="100%" fill="#ffffff"/>',
119
  '<rect x="0" y="0" width="1400" height="760" fill="#ffffff"/>',
120
- '<text x="60" y="58" font-family="Arial, sans-serif" font-size="32" font-weight="700" fill="#10141f">Verified Xperience-10M Episode Pipeline</text>',
121
  '<text x="60" y="88" font-family="Arial, sans-serif" font-size="16" fill="#5b6475">Generated from committed scripts and metrics; no conceptual placeholder stages.</text>',
122
  ]
123
  arrows = [
@@ -333,7 +349,7 @@ def svg_task_architectures(path: Path, summary: dict) -> None:
333
  f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" viewBox="0 0 {width} {height}">',
334
  '<defs><marker id="arrow2" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><path d="M 0 0 L 10 5 L 0 10 z" fill="#cbd5e1"/></marker></defs>',
335
  '<rect width="100%" height="100%" fill="#ffffff"/>',
336
- '<text x="60" y="56" font-family="Arial, sans-serif" font-size="34" font-weight="700" fill="#10141f">Minimal Architectures for the 12 Xperience-10M Episode Tasks</text>',
337
  '<text x="60" y="88" font-family="Arial, sans-serif" font-size="16" fill="#5b6475">Generated from scripts/episode_task_suite.py semantics and committed summary metrics. These are minimal baselines, not deep foundation models.</text>',
338
  ]
339
 
@@ -419,6 +435,7 @@ def collect_summary() -> dict:
419
  suite = read_json(RESULTS / "episode_task_suite/summary_report.json")
420
  manifest = read_json(RESULTS / "episode_task_suite/feature_manifest.json")
421
  return {
 
422
  "models": {
423
  "motion_action": min_action,
424
  "motion_subtask": min_subtask,
@@ -453,13 +470,13 @@ def generate_charts(summary: dict) -> None:
453
  task_rows = []
454
  for task_name, metrics in suite.items():
455
  task_rows.append((task_name, task_score(metrics)))
456
- svg_bar_chart(CHARTS / "episode_task_scores.svg", "Episode Task Suite: Main Scores", task_rows, max_value=1.0)
457
 
458
  neural = summary["suite"].get("neural_tasks", {})
459
  if neural:
460
  neural_rows = [(task_name, task_score(metrics)) for task_name, metrics in neural.items() if "error" not in metrics]
461
  if neural_rows:
462
- svg_bar_chart(CHARTS / "episode_task_scores_neural_mlp.svg", "Episode Task Suite: Neural MLP Main Scores", neural_rows, max_value=1.0)
463
 
464
  comparison_rows = []
465
  for task_name, metrics in suite.items():
@@ -484,7 +501,7 @@ def generate_charts(summary: dict) -> None:
484
  def write_summary_data(summary: dict) -> None:
485
  DOCS.mkdir(parents=True, exist_ok=True)
486
  (DOCS / "data").mkdir(parents=True, exist_ok=True)
487
- (DOCS / "data/summary_metrics.json").write_text(json.dumps(summary, indent=2), encoding="utf-8")
488
 
489
 
490
  def main() -> int:
 
24
  ASSETS = DOCS / "assets"
25
  CHARTS = ASSETS / "charts"
26
 
27
+ OMNI_RELAY = {
28
+ "status": "pending_huggingface_gated_access",
29
+ "dataset": "ropedia-ai/xperience-10m",
30
+ "relay_server": "ANGEL-A100-80Gx4",
31
+ "training_server": "ANGEL-H20-96GX8",
32
+ "selection_strategy": "stratified_round_robin_by_top_level_session",
33
+ "target_episodes": 32,
34
+ "selected_sessions": 32,
35
+ "candidate_scan_top_level_sessions": 64,
36
+ "valid_candidates": 680,
37
+ "estimated_bytes": 72031620552,
38
+ "exclude": ["visualization.rrd"],
39
+ "blocker": "Hugging Face returns 403 pending review for the full Xperience-10M gated dataset.",
40
+ "claim_boundary": "No real 32-episode fine-tune is claimed until the watcher downloads data, transfers it to H20, and the held-out evaluation runs.",
41
+ }
42
+
43
 
44
  def read_json(path: Path) -> dict:
45
  return json.loads(path.read_text(encoding="utf-8"))
 
116
  "numpy softmax classifier",
117
  "metrics and predictions",
118
  ], "#1f63e9"),
119
+ (520, 380, 360, 168, "6. Ropedia Xperience-10M suite", [
120
  f"{task_count} supervised/self-supervised tasks",
121
  "chronological split",
122
  "retrieval, forecast, alignment",
 
133
  f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" viewBox="0 0 {width} {height}">',
134
  '<rect width="100%" height="100%" fill="#ffffff"/>',
135
  '<rect x="0" y="0" width="1400" height="760" fill="#ffffff"/>',
136
+ '<text x="60" y="58" font-family="Arial, sans-serif" font-size="32" font-weight="700" fill="#10141f">Verified Ropedia Xperience-10M Pipeline</text>',
137
  '<text x="60" y="88" font-family="Arial, sans-serif" font-size="16" fill="#5b6475">Generated from committed scripts and metrics; no conceptual placeholder stages.</text>',
138
  ]
139
  arrows = [
 
349
  f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" viewBox="0 0 {width} {height}">',
350
  '<defs><marker id="arrow2" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><path d="M 0 0 L 10 5 L 0 10 z" fill="#cbd5e1"/></marker></defs>',
351
  '<rect width="100%" height="100%" fill="#ffffff"/>',
352
+ '<text x="60" y="56" font-family="Arial, sans-serif" font-size="34" font-weight="700" fill="#10141f">Minimal Architectures for 12 Ropedia Xperience-10M Tasks</text>',
353
  '<text x="60" y="88" font-family="Arial, sans-serif" font-size="16" fill="#5b6475">Generated from scripts/episode_task_suite.py semantics and committed summary metrics. These are minimal baselines, not deep foundation models.</text>',
354
  ]
355
 
 
435
  suite = read_json(RESULTS / "episode_task_suite/summary_report.json")
436
  manifest = read_json(RESULTS / "episode_task_suite/feature_manifest.json")
437
  return {
438
+ "omni_relay": OMNI_RELAY,
439
  "models": {
440
  "motion_action": min_action,
441
  "motion_subtask": min_subtask,
 
470
  task_rows = []
471
  for task_name, metrics in suite.items():
472
  task_rows.append((task_name, task_score(metrics)))
473
+ svg_bar_chart(CHARTS / "episode_task_scores.svg", "Ropedia Xperience-10M Suite: Main Scores", task_rows, max_value=1.0)
474
 
475
  neural = summary["suite"].get("neural_tasks", {})
476
  if neural:
477
  neural_rows = [(task_name, task_score(metrics)) for task_name, metrics in neural.items() if "error" not in metrics]
478
  if neural_rows:
479
+ svg_bar_chart(CHARTS / "episode_task_scores_neural_mlp.svg", "Ropedia Xperience-10M Suite: Neural MLP Main Scores", neural_rows, max_value=1.0)
480
 
481
  comparison_rows = []
482
  for task_name, metrics in suite.items():
 
501
  def write_summary_data(summary: dict) -> None:
502
  DOCS.mkdir(parents=True, exist_ok=True)
503
  (DOCS / "data").mkdir(parents=True, exist_ok=True)
504
+ (DOCS / "data/summary_metrics.json").write_text(json.dumps(summary, indent=2) + "\n", encoding="utf-8")
505
 
506
 
507
  def main() -> int:
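
The `write_summary_data` hunk above appends `"\n"` to the serialized JSON so that `summary_metrics.json` ends with a newline. The sketch below illustrates that write pattern in isolation. The helper name `write_json_with_newline`, the `demo` function, and the temporary directory are assumptions made for illustration; the real script writes to `DOCS / "data/summary_metrics.json"` and builds its payload in `collect_summary()`.

```python
import json
import tempfile
from pathlib import Path


def write_json_with_newline(path: Path, payload: dict) -> None:
    """Serialize payload as indented JSON and end the file with a newline."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")


def demo() -> tuple[str, dict]:
    # Hypothetical payload; only the status string is taken from the diff.
    payload = {"omni_relay": {"status": "pending_huggingface_gated_access"}}
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "data" / "summary_metrics.json"
        write_json_with_newline(target, payload)
        text = target.read_text(encoding="utf-8")
    # The trailing newline does not affect parsing.
    return text, json.loads(text)


if __name__ == "__main__":
    text, parsed = demo()
    print(text.endswith("\n"), parsed["omni_relay"]["status"])
```

A file that ends with a newline avoids the "No newline at end of file" marker in later diffs of the generated JSON.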
scripts/omni/run_omni_finetune_8gpu.sh ADDED
@@ -0,0 +1,138 @@
+ #!/usr/bin/env bash
+ set -euo pipefail
+
+ WORKSPACE="${WORKSPACE:-/home/cy/Ropedia/ropedia-xperience-10m-task-suite}"
+ PROJECT_ROOT="${PROJECT_ROOT:-/home/cy/Ropedia}"
+ VENV_PY="${VENV_PY:-$WORKSPACE/.venv/bin/python}"
+ RUN_ID="${RUN_ID:-xperience10m_qwen3_omni_32ep}"
+ DATA_ROOT="${DATA_ROOT:-$PROJECT_ROOT/modelscope_data}"
+ MAX_EPISODES="${MAX_EPISODES:-32}"
+ MAX_WINDOWS_PER_EPISODE="${MAX_WINDOWS_PER_EPISODE:-128}"
+ MAX_VIDEO_FRAMES="${MAX_VIDEO_FRAMES:-16}"
+ EPOCHS="${EPOCHS:-1}"
+ TRAIN_SPLIT="${TRAIN_SPLIT:-train}"
+ VAL_SPLIT="${VAL_SPLIT:-val}"
+ EVAL_SPLIT="${EVAL_SPLIT:-test}"
+ MODEL_ID="${MODEL_ID:-Qwen/Qwen3-Omni-30B-A3B-Instruct}"
+ LOCAL_MODEL_DIR="${LOCAL_MODEL_DIR:-$PROJECT_ROOT/modelscope_models/Qwen__Qwen3-Omni-30B-A3B-Instruct}"
+
+ RESULT_DIR="$WORKSPACE/results/omni_finetune/$RUN_ID"
+ DATASET_RUN_ID="${RUN_ID}_dataset"
+ DATASET_DIR="$WORKSPACE/results/omni_finetune/$DATASET_RUN_ID"
+ MANIFEST="$RESULT_DIR/episode_manifest.json"
+ LOG_DIR="$RESULT_DIR/logs"
+ mkdir -p "$LOG_DIR" "$LOCAL_MODEL_DIR"
+
+ exec > >(tee -a "$LOG_DIR/pipeline.log") 2>&1
+
+ cd "$WORKSPACE"
+
+ phase() {
+   echo "PHASE: $1"
+   "$VENV_PY" - <<PY
+ import json, time
+ path = "$RESULT_DIR/pipeline_status.jsonl"
+ with open(path, "a", encoding="utf-8") as fp:
+     fp.write(json.dumps({"event": "phase", "phase": "$1", "timestamp": time.time()}) + "\\n")
+ PY
+ }
+
+ phase "preflight"
+ nvidia-smi --query-gpu=index,name,memory.total,memory.used,utilization.gpu --format=csv,noheader,nounits
+ "$VENV_PY" - <<'PY'
+ mods = ["torch", "transformers", "accelerate", "peft", "qwen_omni_utils", "soundfile", "librosa", "imageio_ffmpeg", "modelscope"]
+ for mod in mods:
+     __import__(mod)
+     print(f"{mod}: ok")
+ PY
+
+ phase "download_qwen3_omni_instruct"
+ if ! compgen -G "$LOCAL_MODEL_DIR/*.safetensors" > /dev/null && ! compgen -G "$LOCAL_MODEL_DIR/*.bin" > /dev/null; then
+   if command -v modelscope >/dev/null 2>&1; then
+     modelscope download --model "$MODEL_ID" --local_dir "$LOCAL_MODEL_DIR"
+   else
+     "$VENV_PY" -m modelscope download --model "$MODEL_ID" --local_dir "$LOCAL_MODEL_DIR"
+   fi
+ else
+   echo "Model weights already present in $LOCAL_MODEL_DIR"
+ fi
+
+ phase "build_manifest"
+ "$VENV_PY" scripts/omni/build_episode_manifest.py \
+   --data-root "$DATA_ROOT" \
+   --max-episodes "$MAX_EPISODES" \
+   --train-fraction 0.8 \
+   --val-fraction 0.0 \
+   --test-fraction 0.2 \
+   --output "$MANIFEST"
+
+ EVAL_SPLIT="$("$VENV_PY" - <<PY
+ import json
+ payload = json.load(open("$MANIFEST", "r", encoding="utf-8"))
+ counts = payload.get("summary", {}).get("split_counts", {})
+ requested = "$EVAL_SPLIT"
+ if counts.get(requested, 0):
+     print(requested)
+ elif counts.get("test", 0):
+     print("test")
+ elif counts.get("val", 0):
+     print("val")
+ else:
+     print("train")
+ PY
+ )"
+ echo "Using eval split: $EVAL_SPLIT"
+
+ phase "export_dataset"
+ "$VENV_PY" scripts/omni/export_qwen3_omni_action_dataset.py \
+   --manifest "$MANIFEST" \
+   --run-id "$DATASET_RUN_ID" \
+   --max-windows-per-episode "$MAX_WINDOWS_PER_EPISODE" \
+   --max-video-frames "$MAX_VIDEO_FRAMES"
+
+ DATASET_JSONL="$DATASET_DIR/dataset.jsonl"
+
+ phase "qwen_zero_shot_smoke"
+ "$VENV_PY" scripts/omni/qwen3_omni_inference_smoke.py \
+   --dataset-jsonl "$DATASET_JSONL" \
+   --model-id "$LOCAL_MODEL_DIR" \
+   --split "$EVAL_SPLIT" \
+   --sample-limit 3 \
+   --run-id "${RUN_ID}_zero_shot" \
+   --local-files-only || true
+
+ phase "train_8gpu_lora"
+ CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0,1,2,3,4,5,6,7}" \
+ "$VENV_PY" -m accelerate.commands.launch \
+   --num_processes 8 \
+   --mixed_precision bf16 \
+   scripts/omni/train_qwen3_omni_lora.py \
+   --dataset-jsonl "$DATASET_JSONL" \
+   --model-id "$LOCAL_MODEL_DIR" \
+   --run-id "${RUN_ID}_lora" \
+   --train-split "$TRAIN_SPLIT" \
+   --val-split "$VAL_SPLIT" \
+   --epochs "$EPOCHS" \
+   --batch-size 1 \
+   --gradient-accumulation-steps 8 \
+   --max-train-samples 0 \
+   --max-val-samples 64 \
+   --local-files-only
+
+ phase "eval_lora"
+ "$VENV_PY" scripts/omni/eval_qwen3_omni_lora.py \
+   --dataset-jsonl "$DATASET_JSONL" \
+   --model-id "$LOCAL_MODEL_DIR" \
+   --adapter-dir "$WORKSPACE/checkpoints/${RUN_ID}_lora/adapter_lora" \
+   --run-id "${RUN_ID}_eval" \
+   --eval-split "$EVAL_SPLIT" \
+   --local-files-only
+
+ phase "runbook"
+ "$VENV_PY" scripts/omni/omni_finetune_runbook.py \
+   --run-id "$RUN_ID" \
+   --manifest "$MANIFEST" \
+   --metric-file "$WORKSPACE/results/omni_finetune/${RUN_ID}_eval/metrics.json" || true
+
+ phase "complete"
+ echo "DONE: $RUN_ID"
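
After building the manifest, the script picks the evaluation split with an inline Python heredoc: it uses the requested split if it has episodes, then tries `test`, then `val`, and finally `train`. The same rule can be sketched as a standalone function. The name `resolve_eval_split` and the example counts are assumptions for illustration; the script itself reads the counts from `summary.split_counts` in the manifest.

```python
def resolve_eval_split(split_counts: dict, requested: str = "test") -> str:
    """Return the requested split if it is non-empty, else fall back test -> val -> train."""
    if split_counts.get(requested, 0):
        return requested
    for candidate in ("test", "val"):
        if split_counts.get(candidate, 0):
            return candidate
    # Nothing held out: the caller ends up evaluating on training episodes.
    return "train"


if __name__ == "__main__":
    # Hypothetical counts for 32 episodes split 80/0/20.
    print(resolve_eval_split({"train": 26, "val": 0, "test": 6}, "test"))
    # Hypothetical manifest with no held-out episodes at all.
    print(resolve_eval_split({"train": 32}, "val"))
```

The fallback applies only to `EVAL_SPLIT`; `VAL_SPLIT` is passed to the training step unchanged. When the rule resolves to `train`, the resulting metrics are not held-out results.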
scripts/omni/transfer_xperience10m_a100_to_h20.sh CHANGED
@@ -13,4 +13,4 @@ rsync -avP --partial --append-verify \
13
  "${H20_HOST}:${H20_DATA_ROOT}"
14
 
15
  ssh -i "${SSH_KEY}" -o BatchMode=yes -o StrictHostKeyChecking=accept-new "${H20_HOST}" \
16
- "cd /home/cy/Ropedia/ropedia-episode-task-suite && python3 scripts/omni/discover_xperience10m_sources.py --workspace /home/cy/Ropedia/ropedia-episode-task-suite --data-root /home/cy/Ropedia/modelscope_data --output results/omni_finetune/source_discovery.json --report-output results/omni_finetune/DATA_BLOCKER_REPORT.md"
 
13
  "${H20_HOST}:${H20_DATA_ROOT}"
14
 
15
  ssh -i "${SSH_KEY}" -o BatchMode=yes -o StrictHostKeyChecking=accept-new "${H20_HOST}" \
16
+ "cd /home/cy/Ropedia/ropedia-xperience-10m-task-suite && python3 scripts/omni/discover_xperience10m_sources.py --workspace /home/cy/Ropedia/ropedia-xperience-10m-task-suite --data-root /home/cy/Ropedia/modelscope_data --output results/omni_finetune/source_discovery.json --report-output results/omni_finetune/DATA_BLOCKER_REPORT.md"
scripts/render_overview_figures.py CHANGED
@@ -94,6 +94,7 @@ def arrow() -> str:
94
  def build_pipeline_html(summary: dict, base_path: Path) -> str:
95
  suite = summary["suite"]
96
  task_count = len(suite["tasks"])
 
97
  stage_rows = [
98
  [
99
  stage_card(
@@ -132,21 +133,21 @@ def build_pipeline_html(summary: dict, base_path: Path) -> str:
132
  stage_card(
133
  "05",
134
  "Baseline models",
135
- ["motion-only classifiers", "current all-feature classifiers", "stored weights + predictions"],
136
  COLORS["blue"],
137
  ),
138
  arrow(),
139
  stage_card(
140
  "06",
141
- "Episode task suite",
142
- [f"{task_count} task contracts", "forecast, retrieval, alignment", "chronological evaluation"],
143
  COLORS["teal"],
144
  ),
145
  arrow(),
146
  stage_card(
147
  "07",
148
  "Published artifacts",
149
- ["metrics.json / csv / npz", "GitHub Pages dashboard", "HF Space + dataset + model card"],
150
  COLORS["green"],
151
  ),
152
  ],
@@ -156,6 +157,7 @@ def build_pipeline_html(summary: dict, base_path: Path) -> str:
156
  "Audit check: rerunning scripts to /private/tmp reproduced the committed metrics exactly.",
157
  "Modality check: sample covers video, AAC audio, depth, pose/SLAM, mocap, IMU, and language annotation.",
158
  "Feature check: current baseline manifest has video/depth/pose/mocap/IMU/language blocks, but no audio feature block.",
 
159
  "Scope check: this validates one public sample episode, not cross-episode generalization.",
160
  ]
161
  checks_html = "".join(f"<li>{esc(line)}</li>" for line in checks)
@@ -356,14 +358,14 @@ def build_pipeline_html(summary: dict, base_path: Path) -> str:
356
  <header>
357
  <div>
358
  <div class="kicker">verified single-episode pipeline</div>
359
- <h1>From Xperience-10M episode to reproducible artifacts</h1>
360
- <p class="subtitle">The figure follows the actual code path and separates the full Xperience-10M sample modalities from the current baseline feature manifest.</p>
361
  </div>
362
  <div class="metrics">
363
  <div class="metric"><strong>{suite['num_frames']:,}</strong><span>frames</span></div>
364
  <div class="metric"><strong>{suite['num_windows']:,}</strong><span>windows</span></div>
365
  <div class="metric"><strong>{suite['feature_dim']:,}</strong><span>features</span></div>
366
- <div class="metric"><strong>{task_count}</strong><span>tasks</span></div>
367
  </div>
368
  </header>
369
  {rows_html}
@@ -404,6 +406,7 @@ def build_task_card(row: dict, color: str) -> str:
404
 
405
  def build_architecture_html(summary: dict, base_path: Path) -> str:
406
  suite = summary["suite"]
 
407
  rows_by_task = {row["task"]: row for row in task_architecture_rows(summary)}
408
  group_html = []
409
  for title, color, task_names in TASK_GROUPS:
@@ -421,10 +424,10 @@ def build_architecture_html(summary: dict, base_path: Path) -> str:
421
  )
422
 
423
  family_cards = [
424
- ("Linear softmax", "Class-weighted CE + L2 for action, subtask, next-action, transition, contact, order, and alignment classifiers.", COLORS["blue"]),
425
- ("Ridge regression", "Closed-form dual ridge for hand trajectory forecasting and modality reconstruction.", COLORS["green"]),
426
- ("Ridge + cosine rank", "Project sensor features into text or visual space, then rank candidate windows by cosine similarity.", COLORS["teal"]),
427
- ("Multi-label logistic", "One-vs-rest sigmoid heads over the object vocabulary with top-1 fallback.", COLORS["orange"]),
428
  ]
429
  families = "".join(
430
  f"""
@@ -693,17 +696,17 @@ def build_architecture_html(summary: dict, base_path: Path) -> str:
693
  <div class="content">
694
  <header>
695
  <div>
696
- <div class="kicker">minimal verified model architectures</div>
697
- <h1>12 Xperience-10M episode tasks, four reusable heads</h1>
698
- <p class="subtitle">Each task uses the same aligned episode-window contract, then swaps only the minimal output head needed for labels, forecasting, grounding, reconstruction, or temporal diagnostics.</p>
699
  </div>
700
- <div class="summary-pill"><strong>{len(suite['tasks'])}</strong><span>end-to-end tasks</span></div>
701
  </header>
702
  <section class="shared">
703
  <article><h2>Shared windows</h2><p>{suite['num_frames']:,} frames to {suite['num_windows']:,} windows over video, depth, pose, mocap, inertial, and language features.</p></article>
704
  <article><h2>Feature vector</h2><p>X_all is {suite['feature_dim']:,} dimensions with 17 named blocks; sample audio is documented but not featurized here.</p></article>
705
- <article><h2>Reusable heads</h2><p>Softmax, ridge, ridge ranking, and multi-label logistic heads cover the whole suite.</p></article>
706
- <article><h2>Artifacts</h2><p>Metrics, predictions, models, manifests, and the source summary report are committed.</p></article>
707
  </section>
708
  <section class="families">{families}</section>
709
  <section class="task-groups">{"".join(group_html)}</section>
 
94
  def build_pipeline_html(summary: dict, base_path: Path) -> str:
95
  suite = summary["suite"]
96
  task_count = len(suite["tasks"])
97
+ neural_count = len(suite.get("neural_tasks", {}))
98
  stage_rows = [
99
  [
100
  stage_card(
 
133
  stage_card(
134
  "05",
135
  "Baseline models",
136
+ ["motion-only classifiers", "current all-feature classifiers", "neural MLP task heads"],
137
  COLORS["blue"],
138
  ),
139
  arrow(),
140
  stage_card(
141
  "06",
142
+ "Ropedia Xperience-10M suite",
143
+ [f"{task_count} minimal + {neural_count} neural results", "forecast, retrieval, alignment", "chronological evaluation"],
144
  COLORS["teal"],
145
  ),
146
  arrow(),
147
  stage_card(
148
  "07",
149
  "Published artifacts",
150
+ ["metrics.json / csv / npz / pt", "GitHub Pages dashboard", "NN comparison charts"],
151
  COLORS["green"],
152
  ),
153
  ],
 
157
  "Audit check: rerunning scripts to /private/tmp reproduced the committed metrics exactly.",
158
  "Modality check: sample covers video, AAC audio, depth, pose/SLAM, mocap, IMU, and language annotation.",
159
  "Feature check: current baseline manifest has video/depth/pose/mocap/IMU/language blocks, but no audio feature block.",
160
+ "Neural check: lightweight PyTorch MLP heads are reported beside the minimal task heads under neural_mlp/.",
161
  "Scope check: this validates one public sample episode, not cross-episode generalization.",
162
  ]
163
  checks_html = "".join(f"<li>{esc(line)}</li>" for line in checks)
 
358
  <header>
359
  <div>
360
  <div class="kicker">verified single-episode pipeline</div>
361
+ <h1>From Ropedia Xperience-10M episode to reproducible artifacts</h1>
362
+ <p class="subtitle">The figure follows the actual code path and includes minimal heads plus neural MLP results. Next TODO: Qwen3-Omni fine-tuning and sensor-bridge evaluation on multi-episode splits.</p>
363
  </div>
364
  <div class="metrics">
365
  <div class="metric"><strong>{suite['num_frames']:,}</strong><span>frames</span></div>
366
  <div class="metric"><strong>{suite['num_windows']:,}</strong><span>windows</span></div>
367
  <div class="metric"><strong>{suite['feature_dim']:,}</strong><span>features</span></div>
368
+ <div class="metric"><strong>{task_count}+{neural_count}</strong><span>min + NN tasks</span></div>
369
  </div>
370
  </header>
371
  {rows_html}
 
406
 
407
  def build_architecture_html(summary: dict, base_path: Path) -> str:
408
  suite = summary["suite"]
409
+ neural_count = len(suite.get("neural_tasks", {}))
410
  rows_by_task = {row["task"]: row for row in task_architecture_rows(summary)}
411
  group_html = []
412
  for title, color, task_names in TASK_GROUPS:
 
424
  )
425
 
426
  family_cards = [
427
+ ("Linear softmax", "Minimal classifier for action, subtask, transition, contact, order, and alignment tasks.", COLORS["blue"]),
428
+ ("Ridge regression", "Minimal closed-form projection for forecasting, reconstruction, and retrieval spaces.", COLORS["green"]),
429
+ ("Multi-label logistic", "Minimal one-vs-rest sigmoid heads over the object vocabulary with top-1 fallback.", COLORS["orange"]),
430
+ ("Neural MLP", "Optional PyTorch nonlinear classifier/regressor over the same features, splits, and metrics.", COLORS["red"]),
431
  ]
432
  families = "".join(
433
  f"""
 
696
  <div class="content">
697
  <header>
698
  <div>
699
+ <div class="kicker">minimal + neural verified model architectures</div>
700
+ <h1>12 Ropedia Xperience-10M tasks, minimal and NN heads</h1>
701
+ <p class="subtitle">Each task uses the same aligned episode-window contract. The figure shows minimal heads beside neural MLP metrics; next TODO is Qwen3-Omni fine-tuning with sensor-bridge evaluation.</p>
702
  </div>
703
+ <div class="summary-pill"><strong>{len(suite['tasks'])}+{neural_count}</strong><span>min + NN tasks</span></div>
704
  </header>
705
  <section class="shared">
706
  <article><h2>Shared windows</h2><p>{suite['num_frames']:,} frames to {suite['num_windows']:,} windows over video, depth, pose, mocap, inertial, and language features.</p></article>
707
  <article><h2>Feature vector</h2><p>X_all is {suite['feature_dim']:,} dimensions with 17 named blocks; sample audio is documented but not featurized here.</p></article>
708
+ <article><h2>Reusable heads</h2><p>Minimal softmax/ridge/logistic heads plus optional PyTorch MLP heads cover the whole suite.</p></article>
709
+ <article><h2>Artifacts</h2><p>Metrics, predictions, model weights, neural checkpoints, manifests, and the source summary report are committed.</p></article>
710
  </section>
711
  <section class="families">{families}</section>
712
  <section class="task-groups">{"".join(group_html)}</section>
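
The header pills in this file count neural results with `len(suite.get("neural_tasks", {}))`, while `generate_charts` in `scripts/generate_visualizations.py` skips neural entries that contain an `"error"` key. The two numbers can therefore differ when a neural task failed. The sketch below shows the difference; the function name `count_neural_results`, the task names, and the scores are all hypothetical.

```python
def count_neural_results(suite: dict) -> tuple[int, int]:
    """Return (all neural entries, neural entries without an "error" key)."""
    neural = suite.get("neural_tasks", {})
    charted = [name for name, metrics in neural.items() if "error" not in metrics]
    return len(neural), len(charted)


if __name__ == "__main__":
    # Hypothetical suite in which one neural task recorded an error.
    suite = {
        "tasks": {"action": {}, "subtask": {}},
        "neural_tasks": {
            "action": {"f1": 0.5},
            "subtask": {"error": "not enough samples"},
        },
    }
    total, charted = count_neural_results(suite)
    print(f"pill shows {len(suite['tasks'])}+{total}, chart shows {charted} neural bars")
```

If the pill is meant to report only successful neural runs, the filtered count would be the one to display; the diff as committed uses the unfiltered length.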
scripts/render_task_suite_infographic.py CHANGED
@@ -1,6 +1,6 @@
1
  #!/usr/bin/env python3
2
  """
3
- Render a polished 12-task Xperience-10M episode-suite infographic.
4
 
5
  The task names, inputs, and metrics are read from
6
  results/episode_task_suite/summary_report.json. The output is a deterministic
@@ -470,8 +470,17 @@ def short_io(task_name: str, metrics: dict) -> str:
470
  return custom.get(task_name, metrics.get("input", ""))
471
 
472
 
473
- def task_card(task_name: str, kind: str, metrics: dict, group: dict, index: int) -> str:
474
  label, value = metric_for(task_name, metrics)
475
  io = short_io(task_name, metrics)
476
  return f"""
477
  <article class="task-card" style="--accent:{group['color']};--soft:{group['soft']};">
@@ -482,9 +491,10 @@ def task_card(task_name: str, kind: str, metrics: dict, group: dict, index: int)
482
  <h3>{html.escape(task_name)}</h3>
483
  <p>{html.escape(io)}</p>
484
  <div class="metric">
485
- <span>{html.escape(label)}</span>
486
  <strong>{html.escape(value)}</strong>
487
  </div>
 
488
  </article>
489
  """
490
 
@@ -506,6 +516,7 @@ def modality_card(name: str, line_one: str, line_two: str, index: int, thumbnail
506
 
507
  def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None) -> str:
508
  suite = summary["tasks"]
 
509
  thumbnails = load_sample_thumbnails(sample_dir)
510
  base_layer = ""
511
  if base_image is not None and base_image.exists():
@@ -514,7 +525,7 @@ def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None)
514
  (f"{summary['num_frames']:,}", "frames"),
515
  (f"{summary['num_windows']:,}", "windows"),
516
  (f"{summary['feature_dim']:,}", "features"),
517
- (f"{len(suite)}", "tasks"),
518
  ("70/30", "chronological split"),
519
  ]
520
  stats_html = "".join(
@@ -531,7 +542,7 @@ def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None)
531
  for group in GROUPS:
532
  cards = []
533
  for task_name, kind in group["tasks"]:
534
- cards.append(task_card(task_name, kind, suite[task_name], group, task_index))
535
  task_index += 1
536
  families.append(
537
  f"""
@@ -852,13 +863,18 @@ def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None)
852
  display: inline-flex;
853
  align-items: baseline;
854
  gap: 10px;
855
- margin-top: 14px;
856
  min-height: 32px;
857
  padding: 7px 10px;
858
  border-radius: 8px;
859
  border: 1px solid color-mix(in srgb, var(--accent) 32%, #ffffff);
860
  background: rgba(255,255,255,0.82);
861
  }}
862
  .metric span {{
863
  color: #64748b;
864
  font-size: 13px;
@@ -897,14 +913,14 @@ def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None)
897
  </style>
898
  </head>
899
  <body>
900
- <main class="canvas" aria-label="Xperience-10M 12-task episode suite infographic">
901
  {base_layer}
902
  <div class="content">
903
  <header class="header">
904
  <div>
905
  <div class="kicker">verified single-episode task suite</div>
906
- <h1>Xperience-10M 12-task episode suite</h1>
907
- <p class="subtitle">A clean map from synchronized multimodal windows to 12 auditable task heads, with metrics loaded from the committed summary report.</p>
908
  </div>
909
  <div class="stats">{stats_html}</div>
910
  </header>
@@ -922,7 +938,7 @@ def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None)
922
  <div class="arrow">-></div>
923
  <div class="step"><strong>8,378-d vector</strong><span>current manifest excludes audio features</span></div>
924
  <div class="arrow">-></div>
925
- <div class="step"><strong>12 minimal heads</strong><span>softmax, ridge, logistic</span></div>
926
  </section>
927
 
928
  <section class="families">{''.join(families)}</section>
 
1
  #!/usr/bin/env python3
2
  """
3
+ Render a polished Ropedia Xperience-10M 12-task infographic.
4
 
5
  The task names, inputs, and metrics are read from
6
  results/episode_task_suite/summary_report.json. The output is a deterministic
 
470
  return custom.get(task_name, metrics.get("input", ""))
471
 
472
 
473
+ def task_card(task_name: str, kind: str, metrics: dict, group: dict, index: int, neural_metrics: dict | None = None) -> str:
474
  label, value = metric_for(task_name, metrics)
475
+ neural_html = ""
476
+ if neural_metrics and "error" not in neural_metrics:
477
+ neural_label, neural_value = metric_for(task_name, neural_metrics)
478
+ neural_html = f"""
479
+ <div class="metric neural">
480
+ <span>NN {html.escape(neural_label)}</span>
481
+ <strong>{html.escape(neural_value)}</strong>
482
+ </div>
483
+ """
484
  io = short_io(task_name, metrics)
485
  return f"""
486
  <article class="task-card" style="--accent:{group['color']};--soft:{group['soft']};">
 
491
  <h3>{html.escape(task_name)}</h3>
492
  <p>{html.escape(io)}</p>
493
  <div class="metric">
494
+ <span>min {html.escape(label)}</span>
495
  <strong>{html.escape(value)}</strong>
496
  </div>
497
+ {neural_html}
498
  </article>
499
  """
500
 
 
516
 
517
  def build_html(summary: dict, base_image: Path | None, sample_dir: Path | None) -> str:
518
  suite = summary["tasks"]
519
+ neural_suite = summary.get("neural_tasks", {})
520
  thumbnails = load_sample_thumbnails(sample_dir)
521
  base_layer = ""
522
  if base_image is not None and base_image.exists():
 
525
  (f"{summary['num_frames']:,}", "frames"),
526
  (f"{summary['num_windows']:,}", "windows"),
527
  (f"{summary['feature_dim']:,}", "features"),
528
+ (f"{len(suite)}+{len(neural_suite)}", "min + NN tasks"),
529
  ("70/30", "chronological split"),
530
  ]
531
  stats_html = "".join(
 
542
  for group in GROUPS:
543
  cards = []
544
  for task_name, kind in group["tasks"]:
545
+ cards.append(task_card(task_name, kind, suite[task_name], group, task_index, neural_suite.get(task_name)))
546
  task_index += 1
547
  families.append(
548
  f"""
 
863
  display: inline-flex;
864
  align-items: baseline;
865
  gap: 10px;
866
+ margin-top: 10px;
867
  min-height: 32px;
868
  padding: 7px 10px;
869
  border-radius: 8px;
870
  border: 1px solid color-mix(in srgb, var(--accent) 32%, #ffffff);
871
  background: rgba(255,255,255,0.82);
872
  }}
873
+ .metric.neural {{
874
+ margin-left: 8px;
875
+ border-color: rgba(31,36,33,0.18);
876
+ background: rgba(245,241,233,0.82);
877
+ }}
878
  .metric span {{
879
  color: #64748b;
880
  font-size: 13px;
 
913
  </style>
914
  </head>
915
  <body>
916
+ <main class="canvas" aria-label="Ropedia Xperience-10M 12-task suite infographic">
917
  {base_layer}
918
  <div class="content">
919
  <header class="header">
920
  <div>
921
  <div class="kicker">verified single-episode task suite</div>
922
+ <h1>Ropedia Xperience-10M 12-task suite</h1>
923
+ <p class="subtitle">A clean map from synchronized multimodal windows to 12 auditable task heads, comparing minimal heads with neural MLP results. Next TODO: Qwen3-Omni fine-tuning plus sensor-bridge evaluation.</p>
924
  </div>
925
  <div class="stats">{stats_html}</div>
926
  </header>
 
938
  <div class="arrow">-></div>
939
  <div class="step"><strong>8,378-d vector</strong><span>current manifest excludes audio features</span></div>
940
  <div class="arrow">-></div>
941
+ <div class="step"><strong>12 minimal + NN heads</strong><span>softmax/ridge/logistic plus PyTorch MLP</span></div>
942
  </section>
943
 
944
  <section class="families">{''.join(families)}</section>
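
The updated `task_card` renders a second metric pill only when neural metrics exist and carry no `"error"` key. A minimal sketch of that guard follows. It assumes a hypothetical `metric_pills` helper that receives the label and value directly, with the neural result given as a dict holding `"label"` and `"value"` keys; the real code derives both through `metric_for`.

```python
import html
from typing import Optional


def metric_pills(label: str, value: str, neural: Optional[dict] = None) -> str:
    """Render the minimal-head pill and, when available, a neural pill."""
    pills = [
        f'<div class="metric"><span>min {html.escape(label)}</span>'
        f"<strong>{html.escape(value)}</strong></div>"
    ]
    # Skip the neural pill when no result exists or the run recorded an error.
    if neural and "error" not in neural:
        pills.append(
            f'<div class="metric neural"><span>NN {html.escape(neural["label"])}</span>'
            f'<strong>{html.escape(neural["value"])}</strong></div>'
        )
    return "".join(pills)


if __name__ == "__main__":
    # Hypothetical metric values, for illustration only.
    print(metric_pills("F1", "0.81", {"label": "F1", "value": "0.84"}))
    print(metric_pills("F1", "0.81", {"error": "training failed"}))
```

Escaping every interpolated string with `html.escape` mirrors the committed template and keeps task names or metric labels from injecting markup.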