	---
	license: cc-by-4.0
	task_categories:
	- robotics
	- image-segmentation
	- graph-ml
	language:
	- en
	tags:
	- robotics
	- manipulation
	- disassembly
	- constraint-graph
	- gnn
	- world-model
	- sam2
	- segmentation
	- ur5e
	size_categories:
	- 1K<n<10K
	pretty_name: GNN Disassembly World Model Dataset (v3)
	---

	# GNN Disassembly World Model Dataset (v3)

	Real robot disassembly episodes with side-view per-frame constraint graphs, SAM2 segmentation masks, 256D feature embeddings, full 3D depth information (point clouds), and synchronized robot states. The robot is labeled as a separate agent node with its own mask, embedding, and depth bundle.

	Project: CoRL 2026 — GNN world model for constraint-aware video generation
	Author: Chang Liu (Texas A&M University)
	Hardware: UR5e + Robotiq 2F-85 gripper, OAK-D Pro (side view)
	Format version: v3 (2026-04-10)

	## Dataset Structure

	```
	episode_XX/
	├── metadata.json                # Episode metadata, component counts, labeled frame count
	├── robot_states.npy             # (T, 13) float32: joint angles + TCP pose + gripper
	├── robot_actions.npy            # (T-1, 13) float32: frame-to-frame state deltas
	├── timestamps.npy               # (T, 3) float64
	├── side/
	│   ├── rgb/frame_XXXXXX.png     # 1280×720 RGB (side camera)
	│   └── depth/frame_XXXXXX.npy   # 1280×720 uint16 depth (mm)
	├── wrist/                       # Raw wrist camera data (not used in v3)
	│   ├── rgb/...
	│   └── depth/...
	└── annotations/
	    ├── side_graph.json          # Constraint graph (products only, NO robot)
	    ├── side_masks/
	    │   └── frame_XXXXXX.npz     # {component_id: (H,W) uint8}, products only
	    ├── side_embeddings/
	    │   └── frame_XXXXXX.npz     # {component_id: (256,) float32}, products only
	    ├── side_depth_info/
	    │   └── frame_XXXXXX.npz     # Per-product depth bundle (flat keys)
	    ├── side_robot/
	    │   └── frame_XXXXXX.npz     # Robot bundle, ALWAYS written per labeled frame
	    └── dataset_card.json        # Format description
	```

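	Alignment guarantee: every labeled frame has a file in each of the 4 per-frame annotation directories (`side_masks/`, `side_embeddings/`, `side_depth_info/`, `side_robot/`). Files are aligned by frame index, so `frame_000042.npz` refers to the same frame in all four.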
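
The tree above describes `robot_actions.npy` as frame-to-frame state deltas with one row fewer than `robot_states.npy`. The sketch below illustrates that relationship on toy data, assuming the actions are plain element-wise differences of consecutive states. The card does not say whether joint angles are wrapped or whether the gripper channel is handled differently, so treat this as an illustration rather than the recording pipeline. `state_deltas` is a hypothetical helper, and the toy states use 3 dimensions instead of 13.

```python
from typing import List


def state_deltas(states: List[List[float]]) -> List[List[float]]:
    """Return consecutive differences: deltas[t] = states[t + 1] - states[t]."""
    return [
        [nxt - cur for cur, nxt in zip(states[t], states[t + 1])]
        for t in range(len(states) - 1)
    ]


# Toy trajectory: T = 4 frames, 3 state dimensions (the real files use 13).
toy_states = [
    [0.0, 1.0, 0.5],
    [0.5, 1.0, 0.5],
    [1.0, 1.5, 0.0],
    [1.0, 2.0, 0.0],
]
toy_actions = state_deltas(toy_states)
print(len(toy_states), len(toy_actions))  # T rows of states, T - 1 rows of deltas
```

On the real arrays, comparing `np.diff(states, axis=0)` against the stored actions would be one way to check whether this assumption holds.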

	## Component Types (9 types)

	8 product types (constraint nodes):

	| Index | Type | Color | Notes |
	|-------|------|-------|-------|
	| 0 | `cpu_fan` | #FF6B6B | Always visible at start |
	| 1 | `cpu_bracket` | #4ECDC4 | Hidden at start (under fan) |
	| 2 | `cpu` | #45B7D1 | Hidden at start |
	| 3 | `ram_clip` | #96CEB4 | Multiple instances: ram_clip_1, ram_clip_2, ... |
	| 4 | `ram` | #FFEAA7 | Multiple instances: ram_1, ram_2, ... |
	| 5 | `connector` | #DDA0DD | Multiple instances: connector_1, connector_2, ... |
	| 6 | `graphic_card` | #FF8C42 | Always visible |
	| 7 | `motherboard` | #8B5CF6 | Always visible (base) |

	1 agent type (NOT in constraint graph):

	| Index | Type | Color | Notes |
	|-------|------|-------|-------|
	| 8 | `robot` | #F5F5F5 | Labeled but stored separately. Added as agent node at training time. |

	## Graph Semantics

	### Constraint Graph (Sparse, Stored)

	`side_graph.json` defines the physical constraint relationships between products. Directed edges: `A -> B` means "A must be removed before B can be removed" (A blocks B).

	```
	cpu_fan -> cpu_bracket (fan covers bracket)
	cpu_fan -> motherboard (fan attached to board)
	cpu_bracket -> cpu (bracket holds CPU)
	cpu_bracket -> motherboard
	cpu -> motherboard
	ram_N -> motherboard
	ram_clip_N -> motherboard
	ram_clip_N -> ram_M (user manually pairs)
	connector_N -> motherboard
	graphic_card -> motherboard
	```
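
Because `A -> B` means A has to come out before B, any valid disassembly sequence is a topological order of the constraint graph. The sketch below shows this on a subset of the edges listed above, using the standard-library `graphlib` module (Python 3.9+). It is only an illustration of the edge semantics, not part of the dataset tooling.

```python
from graphlib import TopologicalSorter

# Subset of the constraint edges listed above, as (blocker, blocked) pairs.
edges = [
    ("cpu_fan", "cpu_bracket"),
    ("cpu_fan", "motherboard"),
    ("cpu_bracket", "cpu"),
    ("cpu_bracket", "motherboard"),
    ("cpu", "motherboard"),
]

sorter = TopologicalSorter()
for blocker, blocked in edges:
    # add(node, *predecessors): the blocker must be removed before the blocked part.
    sorter.add(blocked, blocker)

removal_order = list(sorter.static_order())
print(removal_order)
# → ['cpu_fan', 'cpu_bracket', 'cpu', 'motherboard']
```

These five edges form a chain, so the order is unique. With the RAM, clip, and connector nodes included, several orders would be valid.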

	Edge states are delta-encoded in `frame_states`:
	- `locked: true` (1) — constraint active, component cannot be removed
	- `locked: false` (0) — constraint released, component is free
	- Monotonic: once unlocked, stays unlocked
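
The monotonic rule can be checked directly on the `frame_states` deltas. The sketch below scans the deltas in frame order and reports any constraint that is locked again after being released. `find_relock_violations` is a hypothetical helper, and the toy dictionaries only mimic the shape shown for `side_graph.json`.

```python
from typing import Dict, List


def find_relock_violations(frame_states: Dict[str, dict]) -> List[str]:
    """Return messages for constraints that go from unlocked back to locked."""
    unlocked = set()
    violations = []
    # Keys are frame indices stored as strings, so sort them numerically.
    for frame in sorted(frame_states, key=int):
        for edge, locked in frame_states[frame].get("constraints", {}).items():
            if locked and edge in unlocked:
                violations.append(f"{edge} re-locked at frame {frame}")
            elif not locked:
                unlocked.add(edge)
    return violations


# Toy frame_states in the same shape as side_graph.json.
ok_states = {
    "0": {"constraints": {"cpu_fan->cpu_bracket": True}},
    "152": {"constraints": {"cpu_fan->cpu_bracket": False}},
}
bad_states = {**ok_states, "200": {"constraints": {"cpu_fan->cpu_bracket": True}}}

print(find_relock_violations(ok_states))   # no violations
print(find_relock_violations(bad_states))  # one violation at frame 200
```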

	### Fully Connected Graph (Built at Training Time)

	For GNN message passing, the sparse constraint graph is expanded to a fully connected directed graph. Every ordered pair `(i, j)` where `i != j` gets an edge. Self-loops are excluded.

	Edge count: For a graph with N nodes, there are N × (N - 1) directed edges (both directions for every pair).

	Edge features (3D):

	| `has_constraint` | `is_locked` | `src_blocks_dst` | Meaning |
	|---|---|---|---|
	| 1 | 1 | 1 | Physical constraint on this pair, active (locked); the edge source is the blocker |
	| 1 | 1 | 0 | Same locked constraint, seen from the blocked side (reverse edge) |
	| 1 | 0 | 1 | Constraint released (unlocked); the edge source is the blocker |
	| 1 | 0 | 0 | Same released constraint, seen from the blocked side (reverse edge) |
	| 0 | 0 | 0 | No physical constraint on this pair, message passing only |

	Direction is carried by a feature, not by graph structure. For a physical constraint `A → B` (A blocks B's removal):
	- `has_constraint` and `is_locked` describe the unordered pair, so they are identical on `(A, B)` and `(B, A)`.
	- `src_blocks_dst` is 1 on `(A, B)` and 0 on `(B, A)`.

	For example, if `cpu_fan → cpu_bracket` is an active constraint:
	```
	(cpu_fan, cpu_bracket) → has_constraint=1, is_locked=1, src_blocks_dst=1 (fan is the blocker)
	(cpu_bracket, cpu_fan) → has_constraint=1, is_locked=1, src_blocks_dst=0 (bracket is blocked)
	```

	Every node pair therefore exchanges messages in each GNN layer, while the direction of the prerequisite relationship stays recoverable from `src_blocks_dst`.

	Robot (agent node) has NO physical constraints. All edges involving the robot (`robot ↔ any_product`) have features `[0, 0, 0]` and are used for context passing only.

	Node ordering: Node indices in `edge_index` match the order of `components` in `side_graph.json`. When the robot is added (with `load_pyg_frame_with_robot`), it is appended at index `N_products` (the last position).

	## Data File Schemas

	### `side_graph.json`

	```json
	{
	"view": "side",
	"episode_id": "episode_00",
	"goal_component": "connector_1",
	"components": [
	{"id": "cpu_fan", "type": "cpu_fan", "color": "#FF6B6B"},
	{"id": "ram_1", "type": "ram", "color": "#FFEAA7"}
	],
	"edges": [
	{"src": "cpu_fan", "dst": "cpu_bracket", "directed": true},
	{"src": "ram_clip_1", "dst": "ram_1", "directed": true}
	],
	"frame_states": {
	"0": {
	"constraints": {"cpu_fan->cpu_bracket": true},
	"visibility": {"cpu_bracket": false, "cpu": false, "robot": true}
	},
	"152": {
	"constraints": {"cpu_fan->cpu_bracket": false},
	"visibility": {"cpu_fan": false, "cpu_bracket": true, "cpu": true}
	}
	},
	"node_positions": {"cpu_fan": [120, 80]},
	"embedding_dim": 256,
	"feature_extractor": "sam2.1_hiera_base_plus",
	"type_vocab": ["cpu_fan", "cpu_bracket", "cpu", "ram_clip", "ram", "connector", "graphic_card", "motherboard", "robot"]
	}
	```

	Robot is NOT in components. Robot is stored in `side_robot/`.

	### `side_depth_info/frame_XXXXXX.npz`

	Always contains all 7 keys per component in `graph.components`. Flat keys prefixed by component_id.

	| Key | Shape | Dtype | Description |
	|-----|-------|-------|-------------|
	| `{cid}_point_cloud` | (N, 3) | float32 | 3D points in camera frame (meters). Empty (0, 3) if no valid depth. |
	| `{cid}_pixel_coords` | (N, 2) | int32 | (u, v) pixel coords of valid points |
	| `{cid}_raw_depths_mm` | (N,) | uint16 | Raw depth values in mm, filtered to [50, 2000] |
	| `{cid}_centroid` | (3,) | float32 | Mean of point_cloud; [0,0,0] if no valid depth |
	| `{cid}_bbox_2d` | (4,) | int32 | [x1, y1, x2, y2] from mask |
	| `{cid}_area` | (1,) | int32 | Mask pixel count |
	| `{cid}_depth_valid` | (1,) | uint8 | 1 if N > 0 else 0 |
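
Component ids contain underscores themselves (`cpu_fan`, `ram_clip_1`), so a flat key cannot be split on the first underscore. The sketch below regroups the flat keys by matching the seven known field names as suffixes. `split_depth_key` and `group_by_component` are hypothetical helpers, and the toy mapping uses placeholder lists in place of the arrays loaded from a `side_depth_info` file.

```python
from typing import Dict, Tuple

DEPTH_FIELDS = (
    "point_cloud", "pixel_coords", "raw_depths_mm",
    "centroid", "bbox_2d", "area", "depth_valid",
)


def split_depth_key(key: str) -> Tuple[str, str]:
    """Split '{cid}_{field}' into (cid, field) using the known field names."""
    for field in DEPTH_FIELDS:
        suffix = "_" + field
        if key.endswith(suffix):
            return key[: -len(suffix)], field
    raise KeyError(f"Unrecognised depth key: {key}")


def group_by_component(flat: Dict[str, object]) -> Dict[str, Dict[str, object]]:
    """Regroup a flat key mapping into {component_id: {field: value}}."""
    grouped: Dict[str, Dict[str, object]] = {}
    for key, value in flat.items():
        cid, field = split_depth_key(key)
        grouped.setdefault(cid, {})[field] = value
    return grouped


# Placeholder values standing in for the arrays of one side_depth_info file.
flat = {
    "ram_clip_1_bbox_2d": [10, 20, 30, 40],
    "ram_clip_1_area": [400],
    "cpu_fan_depth_valid": [1],
}
grouped = group_by_component(flat)
print(sorted(grouped))  # component ids recovered from the flat keys
```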

	### `side_robot/frame_XXXXXX.npz`

	Always written per labeled frame (with `visible=[0]` if robot not in this frame).

	| Key | Shape | Dtype | Description |
	|-----|-------|-------|-------------|
	| `visible` | (1,) | uint8 | 1 if robot labeled, 0 otherwise |
	| `mask` | (H, W) | uint8 | Binary mask |
	| `embedding` | (256,) | float32 | SAM2 256D feature |
	| `point_cloud` | (N, 3) | float32 | 3D points (meters) |
	| `pixel_coords` | (N, 2) | int32 | (u, v) pixel coords |
	| `raw_depths_mm` | (N,) | uint16 | Raw depths in mm |
	| `centroid` | (3,) | float32 | Mean of point_cloud |
	| `bbox_2d` | (4,) | int32 | From mask |
	| `area` | (1,) | int32 | Pixel count |
	| `depth_valid` | (1,) | uint8 | 1 if N > 0 else 0 |
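
The `point_cloud`, `pixel_coords`, and `raw_depths_mm` entries are related by depth back-projection. The sketch below shows the standard pinhole version for a single pixel. The intrinsics are made-up values for a 1280×720 image, not the calibrated OAK-D Pro parameters, and treating the [50, 2000] mm range as inclusive is an assumption. The card does not document the exact procedure used to build the stored clouds.

```python
from typing import Optional, Tuple


def backproject_pixel(
    u: int, v: int, depth_mm: int,
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Tuple[float, float, float]]:
    """Pinhole back-projection of one pixel to camera-frame meters."""
    # Depth range from the raw_depths_mm row; inclusive bounds are an assumption.
    if not 50 <= depth_mm <= 2000:
        return None
    z = depth_mm / 1000.0
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


# Hypothetical intrinsics, NOT the calibrated camera values.
FX, FY, CX, CY = 900.0, 900.0, 640.0, 360.0

print(backproject_pixel(640, 360, 1000, FX, FY, CX, CY))  # → (0.0, 0.0, 1.0)
print(backproject_pixel(640, 360, 10, FX, FY, CX, CY))    # → None (below 50 mm)
```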

	### `metadata.json`

	```json
	{
	"episode_id": "episode_00",
	"goal_component": "connector_1",
	"num_frames": 604,
	"labeled_frame_count": 246,
	"annotation_complete": false,
	"component_counts": {
	"cpu_fan": 1, "cpu_bracket": 1, "cpu": 1,
	"ram": 2, "ram_clip": 4, "connector": 4,
	"graphic_card": 1, "motherboard": 1
	},
	"format_version": "3.0",
	"sam2_model": "sam2.1_hiera_b+",
	"embedding_dim": 256,
	"fps": 30,
	"cameras": ["side"],
	"robot": "UR5e",
	"gripper": "Robotiq 2F-85"
	}
	```
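
The `component_counts` field can be expanded into the component ids an episode is expected to contain. The sketch below assumes the naming rule from the component table: `ram_clip`, `ram`, and `connector` instances are always numbered from 1, and every other type uses its bare type name. `expected_component_ids` is a hypothetical helper, and the order it produces is not necessarily the order of `components` in `side_graph.json`.

```python
from typing import Dict, List

# Types the component table lists as multi-instance; assumed to always carry a suffix.
MULTI_INSTANCE_TYPES = {"ram_clip", "ram", "connector"}


def expected_component_ids(component_counts: Dict[str, int]) -> List[str]:
    """Expand {type: count} into instance ids such as ram_1, ram_2, cpu_fan."""
    ids: List[str] = []
    for comp_type, count in component_counts.items():
        if comp_type in MULTI_INSTANCE_TYPES:
            ids.extend(f"{comp_type}_{k}" for k in range(1, count + 1))
        else:
            ids.append(comp_type)
    return ids


component_counts = {
    "cpu_fan": 1, "cpu_bracket": 1, "cpu": 1,
    "ram": 2, "ram_clip": 4, "connector": 4,
    "graphic_card": 1, "motherboard": 1,
}
ids = expected_component_ids(component_counts)
print(len(ids))  # → 15
```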

	## Test Data Available

	One episode is fully labeled and validated — you can use it to test the loader:

	Labeled episode: `session_0408_162129/episode_00`

	| Stat | Value |
	|------|-------|
	| Total frames in episode | 604 |
	| Labeled frames | 346 (range 0–351, 6 gaps) |
	| Product components | 15 (cpu_fan, cpu_bracket, cpu, graphic_card, motherboard, connector_1..4, ram_1..2, ram_clip_1..4) |
	| Physical constraints (edges) | 14 |
	| Robot visibility | Visible in 216 / 346 frames |
	| Goal component | `connector_1` |
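
The labeled range is not contiguous, so code that steps through consecutive frames should look up which indices are missing. The sketch below does this on a toy list. `missing_frames` is a hypothetical helper, and on the real episode its input would be the result of `list_labeled_frames(episode)`. If "6 gaps" counts missing indices, 352 indices in the range 0 to 351 minus 6 gives the 346 labeled frames reported above.

```python
from typing import List


def missing_frames(labeled: List[int]) -> List[int]:
    """Indices between the first and last labeled frame that have no annotation."""
    if not labeled:
        return []
    present = set(labeled)
    return [i for i in range(labeled[0], labeled[-1] + 1) if i not in present]


# Toy list of labeled frame indices (sorted, as list_labeled_frames returns them).
toy_labeled = [0, 1, 2, 4, 5, 8]
print(missing_frames(toy_labeled))  # → [3, 6, 7]
```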

	### Download and Test (3 steps)

	Step 1: Download just one episode (lightweight)

	```bash
	pip install huggingface_hub
	```

	```python
	from huggingface_hub import snapshot_download

	local_dir = snapshot_download(
	    repo_id="ChangChrisLiu/GNN_Disassembly_WorldModel",
	    repo_type="dataset",
	    allow_patterns=[
	        "session_0408_162129/episode_00/metadata.json",
	        "session_0408_162129/episode_00/robot_states.npy",
	        "session_0408_162129/episode_00/robot_actions.npy",
	        "session_0408_162129/episode_00/side/rgb/frame_000042.png",
	        "session_0408_162129/episode_00/side/depth/frame_000042.npy",
	        "session_0408_162129/episode_00/annotations/*",
	    ],
	)
	print("Downloaded to:", local_dir)
	```

	Step 2: Save the loader code (copy the self-contained `gnn_disassembly_loader.py` block below into a file)

	Step 3: Run this test script — it loads frame 42, prints the full graph anatomy, and verifies everything:

	```python
	from pathlib import Path
	from gnn_disassembly_loader import (
	    load_pyg_frame_products_only,
	    load_pyg_frame_with_robot,
	    list_labeled_frames,
	    load_frame_data,
	)

	# After snapshot_download above:
	episode = Path(local_dir) / "session_0408_162129" / "episode_00"

	# 1. Enumerate labeled frames
	frames = list_labeled_frames(episode)
	assert len(frames) == 346, f"Expected 346 labeled frames, got {len(frames)}"
	print(f"✓ Labeled frames: {len(frames)} (range {frames[0]}..{frames[-1]})")

	# 2. Load frame 42 — products only
	data1 = load_pyg_frame_products_only(episode, frame_idx=42)
	assert data1.num_nodes == 15, f"Expected 15 products, got {data1.num_nodes}"
	assert data1.edge_index.shape[1] == 15 * 14 # fully connected
	assert data1.edge_attr.shape == (210, 3) # 3D edge features
	print(f"✓ Products-only: {data1}")

	# 3. Load frame 42 — with robot agent
	data2 = load_pyg_frame_with_robot(episode, frame_idx=42)
	assert data2.num_nodes == 16, f"Expected 15 products + 1 robot = 16, got {data2.num_nodes}"
	assert data2.edge_index.shape[1] == 16 * 15
	assert hasattr(data2, "robot_point_cloud")
	print(f"✓ With robot: {data2}")
	print(f" Robot point cloud: {tuple(data2.robot_point_cloud.shape)}")
	print(f" Robot mask: {tuple(data2.robot_mask.shape)}")

	# 4. Verify robot edges are all [0, 0, 0]
	robot_idx = data2.num_nodes - 1
	robot_edges = (data2.edge_index[0] == robot_idx) | (data2.edge_index[1] == robot_idx)
	assert (data2.edge_attr[robot_edges] == 0).all()
	print(f"✓ Robot edges: {robot_edges.sum().item()} — all [0,0,0]")

	# 5. Verify edge feature semantics
	has_c = (data1.edge_attr[:, 0] == 1).sum().item()
	locked = ((data1.edge_attr[:, 0] == 1) & (data1.edge_attr[:, 1] == 1)).sum().item()
	src_blocks = ((data1.edge_attr[:, 0] == 1) & (data1.edge_attr[:, 2] == 1)).sum().item()
	assert has_c == 28 # 14 constraints × 2 directions
	assert locked == 28 # all locked at frame 42
	assert src_blocks == 14 # half the constraint edges have src as blocker
	print(f"✓ Edge features: {has_c} constraint edges, {locked} locked, {src_blocks} forward-direction")

	# 6. Verify fully-connected + symmetric structure
	from collections import Counter
	pairs = Counter()
	for i in range(data1.edge_index.shape[1]):
	    src = data1.edge_index[0, i].item()
	    dst = data1.edge_index[1, i].item()
	    pairs[frozenset([src, dst])] += 1
	# Every unordered pair should appear exactly twice: (i, j) AND (j, i)
	assert all(count == 2 for count in pairs.values())
	print(f"✓ Structurally symmetric: every pair has both directions")

	# 7. Raw data access
	fd = load_frame_data(episode, frame_idx=42)
	print(f"✓ Raw data: {len(fd.masks)} product masks, robot {'visible' if fd.robot else 'hidden'}")

	print("\nAll tests passed! The dataset is ready for training.")
	```

	Expected output:
	```
	✓ Labeled frames: 346 (range 0..351)
	✓ Products-only: Data(x=[15, 269], edge_index=[2, 210], edge_attr=[210, 3], y=[1], num_nodes=15)
	✓ With robot: Data(x=[16, 269], edge_index=[2, 240], edge_attr=[240, 3], y=[1], num_nodes=16, robot_point_cloud=[5729, 3], robot_pixel_coords=[5729, 2], robot_mask=[720, 1280])
	Robot point cloud: (5729, 3)
	Robot mask: (720, 1280)
	✓ Robot edges: 30 — all [0,0,0]
	✓ Edge features: 28 constraint edges, 28 locked, 14 forward-direction
	✓ Structurally symmetric: every pair has both directions
	✓ Raw data: 13 product masks, robot visible

	All tests passed! The dataset is ready for training.
	```

	## Graph Structure — What You Get Per Frame

	Every labeled frame is converted to one PyTorch Geometric `Data` object. Here's exactly what it contains:

	### Node Features (269D per node)

	```
	┌───────────────────────────────────────────────────────────────────────┐
	│ x[i] = 269D feature vector for node i │
	├───────────────────────────────────────────────────────────────────────┤
	│ [0 : 256] SAM2 embedding (256D) │
	│ Masked average pool over SAM2 encoder's vision_features. │
	│ Captures visual appearance of the component. │
	├───────────────────────────────────────────────────────────────────────┤
	│ [256 : 259] 3D position (3D) │
	│ Centroid in camera frame, meters. Mean of the valid │
	│ depth-backprojected points within the mask. │
	│ Zero vector if no valid depth (check depth_valid flag). │
	├───────────────────────────────────────────────────────────────────────┤
	│ [259 : 268] Type one-hot (9D) │
	│ Index order: cpu_fan, cpu_bracket, cpu, ram_clip, ram, │
	│ connector, graphic_card, motherboard, robot. │
	│ Multiple instances (e.g. ram_1, ram_2) share the same │
	│ one-hot — distinguished by their SAM2 embedding + 3D pos.│
	├───────────────────────────────────────────────────────────────────────┤
	│ [268] Visibility (1D) │
	│ Binary flag — 1 if visible this frame, 0 if hidden. │
	│ Delta-encoded through frame_states in side_graph.json. │
	└───────────────────────────────────────────────────────────────────────┘
	```
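
The layout above can be turned into named slices when inspecting a node. The sketch below splits one 269D row into its four parts, using the slice boundaries and type order from the box. `split_node_features` is a hypothetical helper that works on a plain Python list, so a row from a PyG graph would first be converted with `data.x[i].tolist()`.

```python
from typing import Dict, Sequence

TYPE_VOCAB = [
    "cpu_fan", "cpu_bracket", "cpu", "ram_clip", "ram",
    "connector", "graphic_card", "motherboard", "robot",
]


def split_node_features(row: Sequence[float]) -> Dict[str, object]:
    """Split one 269D node feature row into its documented parts."""
    if len(row) != 269:
        raise ValueError(f"Expected 269 features, got {len(row)}")
    one_hot = list(row[259:268])
    return {
        "embedding": list(row[0:256]),        # SAM2 embedding
        "position_m": list(row[256:259]),     # centroid in camera frame, meters
        "type": TYPE_VOCAB[one_hot.index(1.0)],
        "visible": bool(row[268]),
    }


# Toy row: zero embedding, a made-up position, type "ram", visible.
row = [0.0] * 256 + [0.1, -0.2, 0.6] + [0.0] * 9 + [1.0]
row[259 + TYPE_VOCAB.index("ram")] = 1.0

parts = split_node_features(row)
print(parts["type"], parts["position_m"], parts["visible"])
```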

	### Graph Topology — Fully Connected, Structurally Symmetric

	For N nodes, the PyG graph has:
	- `edge_index` shape: (2, N × (N − 1))
	- Every ordered pair `(i, j)` with `i ≠ j` has an edge
	- Both `(i, j)` AND `(j, i)` exist — the graph is not structurally directed
	- Self-loops are excluded

	Why fully connected? With only the sparse constraint edges (physical prerequisites), components that are not directly constrained could exchange information only through several message-passing layers. A fully connected graph lets every node pair communicate within a single layer.

	### Edge Features (3D per edge)

	```
	┌─────────────────┬──────────┬────────────────┬─────────────────────────┐
	│ has_constraint │ is_locked│ src_blocks_dst │ Meaning │
	├─────────────────┼──────────┼────────────────┼─────────────────────────┤
	│ 0 │ 0 │ 0 │ No physical constraint │
	│ │ │ │ (message passing only) │
	├─────────────────┼──────────┼────────────────┼─────────────────────────┤
	│ 1 │ 1 │ 1 │ Physical constraint │
	│ │ │ │ LOCKED, src is blocker │
	├─────────────────┼──────────┼────────────────┼─────────────────────────┤
	│ 1 │ 1 │ 0 │ Physical constraint │
	│ │ │ │ LOCKED, src is blocked │
	│ │ │ │ (reverse direction) │
	├─────────────────┼──────────┼────────────────┼─────────────────────────┤
	│ 1 │ 0 │ 1 │ Physical constraint │
	│ │ │ │ RELEASED (unlocked) │
	│ │ │ │ src is the blocker │
	├─────────────────┼──────────┼────────────────┼─────────────────────────┤
	│ 1 │ 0 │ 0 │ Physical constraint │
	│ │ │ │ RELEASED, src is blocked│
	└─────────────────┴──────────┴────────────────┴─────────────────────────┘
	```

	Direction is a feature, not structure.
	- `has_constraint` and `is_locked` describe the PAIR — they're the same for both `(i,j)` and `(j,i)`.
	- `src_blocks_dst` is asymmetric: it flips depending on which direction the edge goes.

	Example: `cpu_fan` blocks `cpu_bracket` (fan covers bracket). At frame 0 (locked):

	```
	edge (cpu_fan, cpu_bracket) → [1, 1, 1] cpu_fan is the blocker
	edge (cpu_bracket, cpu_fan) → [1, 1, 0] cpu_bracket is the blocked
	```

	At frame 152 after the user removes the fan (unlocked):

	```
	edge (cpu_fan, cpu_bracket) → [1, 0, 1]
	edge (cpu_bracket, cpu_fan) → [1, 0, 0]
	```
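
When debugging a graph it can help to print edge features as text. The sketch below decodes one `[has_constraint, is_locked, src_blocks_dst]` row according to the table above. `describe_edge` is a hypothetical helper and takes plain Python values, so a tensor row would first be converted with `.tolist()`.

```python
from typing import Sequence


def describe_edge(src: str, dst: str, attr: Sequence[float]) -> str:
    """Render one 3D edge feature row as a human-readable sentence."""
    has_constraint, is_locked, src_blocks_dst = (int(v) for v in attr)
    if not has_constraint:
        return f"{src} -> {dst}: message passing only"
    # src_blocks_dst tells which end of this directed edge is the blocker.
    blocker, blocked = (src, dst) if src_blocks_dst else (dst, src)
    state = "locked" if is_locked else "released"
    return f"{src} -> {dst}: {blocker} blocks {blocked} ({state})"


print(describe_edge("cpu_fan", "cpu_bracket", [1, 1, 1]))
# → cpu_fan -> cpu_bracket: cpu_fan blocks cpu_bracket (locked)
print(describe_edge("cpu_bracket", "cpu_fan", [1, 0, 0]))
# → cpu_bracket -> cpu_fan: cpu_fan blocks cpu_bracket (released)
print(describe_edge("robot", "cpu", [0, 0, 0]))
# → robot -> cpu: message passing only
```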

	### Robot Agent Node (Optional)

	When loaded with `load_pyg_frame_with_robot()`, the robot is appended as the last node (index `N_products`). All edges involving the robot have features `[0, 0, 0]` — the robot has no physical constraints, it's a context-providing agent node.

	The raw robot data (point cloud, pixel coords, full mask) is attached as extra tensors on the `Data` object for optional PointNet-style encoding.

	### Matching a Frame to Its RGB Image

	Frame indices in the loader directly map to image files:

	```python
	frame_idx = 42
	rgb_path = episode / "side" / "rgb" / f"frame_{frame_idx:06d}.png"
	depth_path = episode / "side" / "depth" / f"frame_{frame_idx:06d}.npy"
	```

	Example — load PyG frame + matching image + depth:

	```python
	from pathlib import Path
	import numpy as np
	from PIL import Image
	from gnn_disassembly_loader import load_pyg_frame_with_robot

	episode = Path("episode_00")
	frame_idx = 42

	# PyG graph for this frame
	data = load_pyg_frame_with_robot(episode, frame_idx)

	# Matching RGB image (1280x720 PNG)
	rgb = np.array(Image.open(episode / "side" / "rgb" / f"frame_{frame_idx:06d}.png"))
	print("RGB shape:", rgb.shape) # (720, 1280, 3)

	# Matching depth (1280x720 uint16 mm)
	depth = np.load(episode / "side" / "depth" / f"frame_{frame_idx:06d}.npy")
	print("Depth shape:", depth.shape, depth.dtype) # (720, 1280) uint16

	# Robot mask is in the PyG data if robot is visible
	if hasattr(data, "robot_mask"):
	    robot_mask = data.robot_mask.numpy() # (720, 1280) uint8
	    print("Robot mask area:", robot_mask.sum(), "pixels")
	```

	## Loading the Data — PyTorch Geometric

	This section contains self-contained code you can copy-paste directly. No need to clone any repo.

	### Prerequisites

	```bash
	pip install torch numpy torch_geometric pillow
	```

	### Self-contained PyG loader

	Copy this into a file called `gnn_disassembly_loader.py`:

	```python
	"""Self-contained PyG loader for the GNN Disassembly dataset.

	Two loader variants:
	    - load_pyg_frame_products_only(ep, frame) → constraint graph only, no robot
	    - load_pyg_frame_with_robot(ep, frame)    → constraint graph + robot agent node

	Both return torch_geometric.data.Data with:
	    x          (N, 269)      node features
	    edge_index (2, N*(N-1))  fully connected directed message-passing edges
	    edge_attr  (N*(N-1), 3)  [has_constraint, is_locked, src_blocks_dst]
	    num_nodes  N

	Notes on the edge feature design:
	    - The graph is FULLY CONNECTED and structurally symmetric.
	      Both (i, j) and (j, i) exist in edge_index for every node pair i != j.
	    - Direction is NOT encoded in the graph structure. It is encoded as
	      a feature: `src_blocks_dst`.
	    - `has_constraint` and `is_locked` are symmetric per pair (same value
	      for both (i, j) and (j, i)).
	    - `src_blocks_dst` is asymmetric: it is 1 if the edge's src node
	      physically blocks its dst node, 0 otherwise.
	"""

	import json
	from dataclasses import dataclass
	from pathlib import Path
	from typing import Dict, List, Optional, Tuple

	import numpy as np
	import torch
	from torch_geometric.data import Data


	# ─────────────────────────────────────────────────────────────────────────────
	# Helpers
	# ─────────────────────────────────────────────────────────────────────────────

	def list_labeled_frames(episode_dir: Path) -> List[int]:
	    """Return sorted list of frame indices that have saved annotations."""
	    mask_dir = episode_dir / "annotations" / "side_masks"
	    if not mask_dir.exists():
	        return []
	    frames = []
	    for p in mask_dir.glob("frame_*.npz"):
	        try:
	            frames.append(int(p.stem.split("_")[1]))
	        except (ValueError, IndexError):
	            continue
	    return sorted(frames)


	def resolve_frame_state(graph_json: dict, frame_idx: int) -> Tuple[Dict[str, bool], Dict[str, bool]]:
	    """Resolve delta-encoded constraints + visibility at a frame.

	    Walks frame_states from frame 0 to frame_idx, accumulating deltas.
	    Returns (constraints_dict, visibility_dict).
	    """
	    constraints: Dict[str, bool] = {}
	    visibility: Dict[str, bool] = {}
	    # Defaults: every component visible, every edge locked
	    for c in graph_json["components"]:
	        visibility[c["id"]] = True
	    for e in graph_json["edges"]:
	        constraints[f"{e['src']}->{e['dst']}"] = True
	    # Walk deltas up to frame_idx
	    fs_dict = graph_json.get("frame_states", {})
	    for f in sorted([int(k) for k in fs_dict]):
	        if f > frame_idx:
	            break
	        fs = fs_dict[str(f)]
	        for k, v in fs.get("constraints", {}).items():
	            constraints[k] = v
	        for k, v in fs.get("visibility", {}).items():
	            visibility[k] = v
	    return constraints, visibility


	def type_one_hot(comp_type: str, type_vocab: List[str]) -> List[float]:
	    """9-dim one-hot encoding of component type based on type_vocab."""
	    return [1.0 if t == comp_type else 0.0 for t in type_vocab]


	# ─────────────────────────────────────────────────────────────────────────────
	# Raw data loader (NumPy only, no torch)
	# ─────────────────────────────────────────────────────────────────────────────

	@dataclass
	class FrameData:
	    graph: dict
	    masks: Dict[str, np.ndarray]
	    embeddings: Dict[str, np.ndarray]
	    depth_info: dict
	    robot: Optional[dict]
	    constraints: Dict[str, bool]
	    visibility: Dict[str, bool]


	def load_frame_data(episode_dir: Path, frame_idx: int) -> FrameData:
	    """Load all v3 annotation files for one frame."""
	    anno = episode_dir / "annotations"

	    with open(anno / "side_graph.json") as f:
	        graph = json.load(f)

	    def _load_npz_dict(path: Path) -> Dict[str, np.ndarray]:
	        if not path.exists():
	            return {}
	        d = np.load(path)
	        return {k: d[k] for k in d.files}

	    masks = _load_npz_dict(anno / "side_masks" / f"frame_{frame_idx:06d}.npz")
	    embeddings = _load_npz_dict(anno / "side_embeddings" / f"frame_{frame_idx:06d}.npz")
	    depth_info = _load_npz_dict(anno / "side_depth_info" / f"frame_{frame_idx:06d}.npz")

	    # The robot bundle is kept only when the robot is labeled in this frame.
	    robot: Optional[dict] = None
	    robot_path = anno / "side_robot" / f"frame_{frame_idx:06d}.npz"
	    if robot_path.exists():
	        r = np.load(robot_path)
	        if r["visible"][0] == 1:
	            robot = {k: r[k] for k in r.files}

	    constraints, visibility = resolve_frame_state(graph, frame_idx)
	    return FrameData(graph, masks, embeddings, depth_info, robot, constraints, visibility)


	# ─────────────────────────────────────────────────────────────────────────────
	# PyG loader — products only
	# ─────────────────────────────────────────────────────────────────────────────

	def load_pyg_frame_products_only(episode_dir: Path, frame_idx: int) -> Data:
	    """Fully connected PyG graph WITHOUT robot.

	    Returns Data(
	        x=[N, 269],
	        edge_index=[2, N*(N-1)],
	        edge_attr=[N*(N-1), 3],  # [has_constraint, is_locked, src_blocks_dst]
	        y=[1],                   # frame index
	        num_nodes=N,
	    )
	    where N = number of product components (robot excluded).
	    """
	    fd = load_frame_data(episode_dir, frame_idx)
	    graph = fd.graph
	    type_vocab = graph["type_vocab"]  # 9 entries incl. robot
	    nodes = graph["components"]       # robot already excluded per spec
	    N = len(nodes)

	    # ── Node features ──
	    # [256D SAM2 embedding, 3D position, 9D type one-hot, 1D visibility] = 269D
	    x_list = []
	    for node in nodes:
	        cid = node["id"]
	        emb = fd.embeddings.get(cid, np.zeros(256, dtype=np.float32))

	        depth_valid_key = f"{cid}_depth_valid"
	        centroid_key = f"{cid}_centroid"
	        if (depth_valid_key in fd.depth_info
	                and int(fd.depth_info[depth_valid_key][0]) == 1):
	            pos = fd.depth_info[centroid_key].astype(np.float32)
	        else:
	            pos = np.zeros(3, dtype=np.float32)

	        type_oh = type_one_hot(node["type"], type_vocab)  # 9D
	        vis = 1.0 if fd.visibility.get(cid, True) else 0.0

	        feat = np.concatenate([
	            emb.astype(np.float32),
	            pos,
	            np.array(type_oh, dtype=np.float32),
	            np.array([vis], dtype=np.float32),
	        ])
	        x_list.append(feat)
	    x = torch.tensor(np.stack(x_list), dtype=torch.float32) if x_list else torch.empty((0, 269))

	    # ── Fully connected edges with 3D features ──
	    # Edge feature: [has_constraint, is_locked, src_blocks_dst]
	    # - has_constraint & is_locked are SYMMETRIC for the pair (A, B)
	    # - src_blocks_dst is ASYMMETRIC: 1 if edge's src physically blocks dst
	    constraint_set = {(e["src"], e["dst"]) for e in graph["edges"]}
	    pair_forward = {}  # frozenset({a, b}) -> (blocker, blocked)
	    for (s, d) in constraint_set:
	        pair_forward[frozenset([s, d])] = (s, d)

	    src_idx, dst_idx, edge_attr = [], [], []
	    for i in range(N):
	        for j in range(N):
	            if i == j:
	                continue
	            src_id = nodes[i]["id"]
	            dst_id = nodes[j]["id"]
	            src_idx.append(i)
	            dst_idx.append(j)

	            pair_key = frozenset([src_id, dst_id])
	            if pair_key in pair_forward:
	                forward = pair_forward[pair_key]
	                constraint_key = f"{forward[0]}->{forward[1]}"
	                is_locked = fd.constraints.get(constraint_key, True)
	                src_blocks_dst = 1.0 if src_id == forward[0] else 0.0
	                edge_attr.append([
	                    1.0,
	                    1.0 if is_locked else 0.0,
	                    src_blocks_dst,
	                ])
	            else:
	                edge_attr.append([0.0, 0.0, 0.0])  # message passing only

	    return Data(
	        x=x,
	        edge_index=torch.tensor([src_idx, dst_idx], dtype=torch.long),
	        edge_attr=torch.tensor(edge_attr, dtype=torch.float32),
	        y=torch.tensor([frame_idx], dtype=torch.long),
	        num_nodes=N,
	    )


	# ─────────────────────────────────────────────────────────────────────────────
	# PyG loader — with robot agent node
	# ─────────────────────────────────────────────────────────────────────────────

	def load_pyg_frame_with_robot(episode_dir: Path, frame_idx: int) -> Data:
	    """Fully connected PyG graph WITH robot appended as agent node.

	    Robot is node N (the last node). All edges involving the robot have
	    features [0, 0, 0] because the robot has no physical constraints.

	    If the robot is not visible at this frame, returns the products-only graph.
	    Additional attached tensors when robot is visible:
	        data.robot_point_cloud  (M, 3) float32
	        data.robot_pixel_coords (M, 2) int32
	        data.robot_mask         (H, W) uint8
	    """
	    data = load_pyg_frame_products_only(episode_dir, frame_idx)
	    fd = load_frame_data(episode_dir, frame_idx)
	    if fd.robot is None:
	        return data

	    graph = fd.graph
	    type_vocab = graph["type_vocab"]
	    products = graph["components"]
	    N_prod = len(products)
	    N = N_prod + 1

	    # ── Build robot node features ──
	    robot_emb = fd.robot["embedding"].astype(np.float32)
	    robot_pos = (fd.robot["centroid"].astype(np.float32)
	                 if int(fd.robot["depth_valid"][0]) == 1
	                 else np.zeros(3, dtype=np.float32))
	    robot_type_oh = type_one_hot("robot", type_vocab)
	    robot_feat = np.concatenate([
	        robot_emb, robot_pos,
	        np.array(robot_type_oh, dtype=np.float32),
	        np.array([1.0], dtype=np.float32),  # robot is visible in this branch
	    ])
	    x = torch.cat([data.x, torch.tensor(robot_feat, dtype=torch.float32).unsqueeze(0)], dim=0)

	    # ── Rebuild edges with 3D features ──
	    constraint_set = {(e["src"], e["dst"]) for e in graph["edges"]}
	    pair_forward = {}
	    for (s, d) in constraint_set:
	        pair_forward[frozenset([s, d])] = (s, d)

	    src_idx, dst_idx, edge_attr = [], [], []

	    # Products × Products
	    for i in range(N_prod):
	        for j in range(N_prod):
	            if i == j:
	                continue
	            src_id = products[i]["id"]
	            dst_id = products[j]["id"]
	            src_idx.append(i)
	            dst_idx.append(j)
	            pair_key = frozenset([src_id, dst_id])
	            if pair_key in pair_forward:
	                forward = pair_forward[pair_key]
	                is_locked = fd.constraints.get(f"{forward[0]}->{forward[1]}", True)
	                src_blocks_dst = 1.0 if src_id == forward[0] else 0.0
	                edge_attr.append([1.0, 1.0 if is_locked else 0.0, src_blocks_dst])
	            else:
	                edge_attr.append([0.0, 0.0, 0.0])

	    # Robot ↔ Products (both directions, message-passing only)
	    robot_idx = N_prod
	    for i in range(N_prod):
	        src_idx.append(robot_idx)
	        dst_idx.append(i)
	        edge_attr.append([0.0, 0.0, 0.0])
	        src_idx.append(i)
	        dst_idx.append(robot_idx)
	        edge_attr.append([0.0, 0.0, 0.0])

	    data = Data(
	        x=x,
	        edge_index=torch.tensor([src_idx, dst_idx], dtype=torch.long),
	        edge_attr=torch.tensor(edge_attr, dtype=torch.float32),
	        y=torch.tensor([frame_idx], dtype=torch.long),
	        num_nodes=N,
	    )
	    data.robot_point_cloud = torch.tensor(fd.robot["point_cloud"], dtype=torch.float32)
	    data.robot_pixel_coords = torch.tensor(fd.robot["pixel_coords"], dtype=torch.int32)
	    data.robot_mask = torch.tensor(fd.robot["mask"], dtype=torch.uint8)
	    return data


	# ─────────────────────────────────────────────────────────────────────────────
	# Episode iterator
	# ─────────────────────────────────────────────────────────────────────────────

	def iterate_episode(episode_dir: Path, with_robot: bool = True):
	    """Yield (frame_idx, Data) pairs for all labeled frames in an episode."""
	    loader = load_pyg_frame_with_robot if with_robot else load_pyg_frame_products_only
	    for frame_idx in list_labeled_frames(episode_dir):
	        yield frame_idx, loader(episode_dir, frame_idx)
	```

	### Usage Examples

	#### Variant 1: Constraint Graph Only (No Robot)

	```python
	from pathlib import Path
	from gnn_disassembly_loader import load_pyg_frame_products_only, list_labeled_frames

	episode = Path("episode_00") # downloaded from HF

	# Enumerate labeled frames
	frames = list_labeled_frames(episode)
	print(f"Episode has {len(frames)} labeled frames")
	# → Episode has 346 labeled frames

	# Load one frame as a fully connected PyG graph (products only)
	data = load_pyg_frame_products_only(episode, frame_idx=42)
	print(data)
	# → Data(x=[15, 269], edge_index=[2, 210], edge_attr=[210, 3], y=[1], num_nodes=15)

	# For N=15 products: edges = 15 * 14 = 210 (fully connected)
	print("Node features:", data.x.shape) # (15, 269)
	print("Edges:", data.edge_index.shape) # (2, 210)
	print("Edge attrs:", data.edge_attr.shape) # (210, 3) = [has_constraint, is_locked, src_blocks_dst]

	# Count edge feature breakdown
	has_c = (data.edge_attr[:, 0] == 1).sum().item()
	locked = ((data.edge_attr[:, 0] == 1) & (data.edge_attr[:, 1] == 1)).sum().item()
	src_blocks = ((data.edge_attr[:, 0] == 1) & (data.edge_attr[:, 2] == 1)).sum().item()
	print(f"Edges with physical constraint: {has_c}")
	print(f" currently locked: {locked}")
	print(f" where src is the blocker: {src_blocks}")
	print(f"Message-passing-only edges: {(data.edge_attr[:, 0] == 0).sum().item()}")
	```

	#### Variant 2: Constraint Graph + Robot Agent Node

	```python
	from gnn_disassembly_loader import load_pyg_frame_with_robot

	data = load_pyg_frame_with_robot(episode, frame_idx=42)
	print(data)
	# → Data(x=[16, 269], edge_index=[2, 240], edge_attr=[240, 3], y=[1], num_nodes=16)
	# Robot is the last node (index 15 for a 15-product graph).
	# Robot edges: 15 products * 2 directions = 30 extra edges → 210 + 30 = 240

	# Verify robot edges are all message-passing (no constraint)
	robot_idx = data.num_nodes - 1
	robot_edges = (data.edge_index[0] == robot_idx) | (data.edge_index[1] == robot_idx)
	assert (data.edge_attr[robot_edges] == 0).all(), "Robot edges must be [0, 0, 0]"
	print(f"Robot edges: {robot_edges.sum().item()} — all [0, 0, 0]")

	# Raw robot data (optional, for PointNet-style encoding)
	print("Robot point cloud:", data.robot_point_cloud.shape) # (M, 3) — M varies per frame
	print("Robot mask:", data.robot_mask.shape) # (720, 1280)
	```

	#### Edge Feature Semantics

	Each row of `data.edge_attr` is 3-dimensional: `[has_constraint, is_locked, src_blocks_dst]`.

	```
	┌──────────────────┬────────────┬────────────────┬─────────────────────────────────┐
	│ has_constraint   │ is_locked  │ src_blocks_dst │ Meaning                         │
	├──────────────────┼────────────┼────────────────┼─────────────────────────────────┤
	│ 0                │ 0          │ 0              │ No physical constraint          │
	│                  │            │                │ Message passing only            │
	├──────────────────┼────────────┼────────────────┼─────────────────────────────────┤
	│ 1                │ 1          │ 1              │ Edge src physically blocks dst  │
	│                  │            │                │ Constraint currently LOCKED     │
	├──────────────────┼────────────┼────────────────┼─────────────────────────────────┤
	│ 1                │ 1          │ 0              │ Edge dst physically blocks src  │
	│                  │            │                │ (the reverse direction of the   │
	│                  │            │                │ physical constraint)            │
	│                  │            │                │ Constraint currently LOCKED     │
	├──────────────────┼────────────┼────────────────┼─────────────────────────────────┤
	│ 1                │ 0          │ 1              │ Edge src physically blocks dst  │
	│                  │            │                │ Constraint RELEASED             │
	├──────────────────┼────────────┼────────────────┼─────────────────────────────────┤
	│ 1                │ 0          │ 0              │ Edge dst physically blocks src  │
	│                  │            │                │ Constraint RELEASED             │
	└──────────────────┴────────────┴────────────────┴─────────────────────────────────┘
	```

	Important: The graph is fully connected and structurally symmetric — both `(A, B)` and `(B, A)` edges exist for every pair. `has_constraint` and `is_locked` are the same for both directions (they describe the unordered pair). `src_blocks_dst` flips between the two directions — it tells you whether the edge's source is the one doing the blocking.

	Example: CPU bracket blocks CPU removal

	If `cpu_bracket → cpu` is an active constraint, the loader produces:

	```
	Edge (cpu_bracket, cpu): [1, 1, 1] # cpu_bracket blocks cpu, locked, src=blocker
	Edge (cpu, cpu_bracket): [1, 1, 0] # same physical pair, src=blocked
	```

	When the user unlocks the constraint (e.g., after releasing the bracket):
	```
	Edge (cpu_bracket, cpu): [1, 0, 1] # constraint released, but bracket still named as blocker
	Edge (cpu, cpu_bracket): [1, 0, 0]
	```
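
The encoding above can be summarized in a few lines of plain Python. This is an illustrative sketch only: `edge_attr_pair` and `describe_edge` are hypothetical helpers that mirror the table, not functions exported by the loader.

```python
def edge_attr_pair(locked):
    """Return the attribute rows for (blocker -> blocked) and (blocked -> blocker)."""
    lock = 1 if locked else 0
    # has_constraint and is_locked are shared; only src_blocks_dst flips.
    return [1, lock, 1], [1, lock, 0]


def describe_edge(attr):
    """Turn one [has_constraint, is_locked, src_blocks_dst] row into a label."""
    has_constraint, is_locked, src_blocks_dst = attr
    if not has_constraint:
        return "no physical constraint (message passing only)"
    direction = "src blocks dst" if src_blocks_dst else "dst blocks src"
    state = "locked" if is_locked else "released"
    return f"{direction}, {state}"


forward, reverse = edge_attr_pair(locked=True)
print(forward, describe_edge(forward))  # [1, 1, 1] src blocks dst, locked
print(reverse, describe_edge(reverse))  # [1, 1, 0] dst blocks src, locked
```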

	### Iterating the Full Episode

	```python
	from torch_geometric.loader import DataLoader
	from gnn_disassembly_loader import iterate_episode

	# Build a dataset list
	data_list = [data for _, data in iterate_episode(episode, with_robot=True)]
	print(f"Loaded {len(data_list)} frames")

	# Batch them for training
	loader = DataLoader(data_list, batch_size=8, shuffle=True)
	for batch in loader:
	    print(batch.x.shape, batch.edge_index.shape, batch.edge_attr.shape)
	    break
	```

	### Adding Robot State as Node Features (Graph B)

	For the perception + robot state variant, concatenate the 13D robot state to every node:

	```python
	import numpy as np
	import torch

	robot_states = np.load(episode / "robot_states.npy") # (T, 13)

	def add_robot_state_to_graph(data, frame_idx, robot_states):
	    robot_state_t = torch.tensor(robot_states[frame_idx], dtype=torch.float32) # (13,)
	    broadcast = robot_state_t.unsqueeze(0).expand(data.num_nodes, -1) # (N, 13)
	    data.x = torch.cat([data.x, broadcast], dim=1) # (N, 282)
	    return data

	data_b = add_robot_state_to_graph(data, frame_idx=42, robot_states=robot_states)
	print("Graph B node features:", data_b.x.shape) # (16, 282) for with_robot variant
	```

	## Node Feature Layout (269D)

	```
	[0 : 256] SAM2 embedding (256D) — masked avg pool over vision_features
	[256 : 259] 3D position (3D) — centroid in camera frame (meters)
	[259 : 268] type one-hot (9D) — index by type_vocab (incl. "robot")
	[268] visibility (1D) — binary flag
	```

	Total: 269D per node.

	For Graph B (with robot state broadcast):
	```
	[0 : 269] Graph A features (269D)
	[269 : 275] joint positions (6D) — UR5e joint angles (radians)
	[275 : 281] TCP pose (6D) — [x, y, z, rx, ry, rz]
	[281] gripper position (1D) — Robotiq 2F-85 (0-255)
	```

	Total: 282D per node.
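
The two layouts can be expressed as named slices so that downstream code does not hard-code offsets. This is a minimal NumPy sketch: the field names and the `split_node_features` helper are hypothetical, chosen here for illustration, while the index ranges are copied from the layouts above.

```python
import numpy as np

# Index ranges for the 269D Graph A layout.
NODE_FIELDS = {
    "embedding": slice(0, 256),
    "position": slice(256, 259),
    "type_one_hot": slice(259, 268),
    "visibility": slice(268, 269),
}
# Extra ranges present only in the 282D Graph B layout.
ROBOT_STATE_FIELDS = {
    "joint_positions": slice(269, 275),
    "tcp_pose": slice(275, 281),
    "gripper": slice(281, 282),
}


def split_node_features(x):
    """Split an (N, 269) or (N, 282) feature matrix into named column blocks."""
    fields = dict(NODE_FIELDS)
    if x.shape[1] == 282:
        fields.update(ROBOT_STATE_FIELDS)
    elif x.shape[1] != 269:
        raise ValueError(f"expected 269 or 282 feature columns, got {x.shape[1]}")
    return {name: x[:, cols] for name, cols in fields.items()}


x = np.zeros((16, 282), dtype=np.float32)  # placeholder features
parts = split_node_features(x)
print({name: block.shape for name, block in parts.items()})
```

The same slices work on a `torch.Tensor`, since both libraries support basic column slicing.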

	## Raw Data Access (No PyG)

	If you prefer raw NumPy without PyTorch Geometric:

	```python
	from scripts.pyg_loader import load_frame_data

	fd = load_frame_data(episode, frame_idx=42)

	print("Graph:", fd.graph["components"])
	print("Masks:", list(fd.masks.keys()))
	print("Resolved visibility:", fd.visibility)
	print("Robot present:", fd.robot is not None)

	if fd.robot is not None:
	    print("Robot mask shape:", fd.robot["mask"].shape)
	    print("Robot point cloud:", fd.robot["point_cloud"].shape)
	    print("Robot centroid (m):", fd.robot["centroid"])

	# Access a specific component's depth info
	for key in ["point_cloud", "pixel_coords", "centroid", "area", "depth_valid"]:
	    full_key = f"cpu_fan_{key}"
	    if full_key in fd.depth_info:
	        print(f"cpu_fan {key}: {fd.depth_info[full_key]}")
	```
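
Since the depth bundle uses flat `{component_id}_{key}` names and component IDs themselves contain underscores, regrouping has to match on the known suffixes rather than split on `_`. The sketch below assumes the five suffixes used in the example above are the complete set, which may not hold for the real files; `group_depth_info` is a hypothetical helper and the sample values are dummies.

```python
DEPTH_SUFFIXES = ("point_cloud", "pixel_coords", "centroid", "area", "depth_valid")


def group_depth_info(flat):
    """Regroup {"<component_id>_<suffix>": value} into {component_id: {suffix: value}}."""
    grouped = {}
    for full_key, value in flat.items():
        for suffix in DEPTH_SUFFIXES:
            tail = "_" + suffix
            if full_key.endswith(tail):
                component_id = full_key[: -len(tail)]
                grouped.setdefault(component_id, {})[suffix] = value
                break  # keys that match no known suffix are skipped
    return grouped


# Dummy values standing in for arrays loaded from a side_depth_info .npz file.
flat = {
    "cpu_fan_centroid": (0.1, 0.0, 0.6),
    "cpu_fan_area": 5120,
    "cpu_bracket_depth_valid": True,
}
grouped = group_depth_info(flat)
print(sorted(grouped))  # ['cpu_bracket', 'cpu_fan']
```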

	## Recording Hardware

	- Robot: UR5e + Robotiq 2F-85 gripper
	- Side camera: Luxonis OAK-D Pro (static viewpoint)
	- Intrinsics: fx=1033.8, fy=1033.7, cx=632.9, cy=359.9
	- Recording rate: 30 Hz
	- Image size: 1280 × 720
	- Depth format: uint16, millimeters
	- Teleoperation: Thrustmaster SOL-R2 HOSAS controllers
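
With the intrinsics and depth format listed above, a raw depth frame can be back-projected into camera-frame 3D points. This is a hedged sketch of the standard pinhole model, not necessarily how the dataset's point clouds were produced: it assumes the depth image is aligned to the RGB image, ignores lens distortion, and treats a depth of 0 as invalid. `backproject` is a hypothetical helper.

```python
import numpy as np

# Side-camera intrinsics from the hardware list (pixels).
FX, FY, CX, CY = 1033.8, 1033.7, 632.9, 359.9


def backproject(depth_mm, mask=None):
    """Convert an (H, W) uint16 depth image in millimeters to (M, 3) points in meters.

    Only pixels with non-zero depth (and inside `mask`, if given) are kept.
    """
    valid = depth_mm > 0
    if mask is not None:
        valid &= mask.astype(bool)
    v, u = np.nonzero(valid)  # row (y) and column (x) pixel indices
    z = depth_mm[v, u].astype(np.float32) / 1000.0  # mm -> m
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return np.stack([x, y, z], axis=1).astype(np.float32)


# Synthetic frame: a single valid pixel near the principal point, 1 m away.
depth = np.zeros((720, 1280), dtype=np.uint16)
depth[360, 633] = 1000
points = backproject(depth)
print(points.shape)  # (1, 3)
```

Passing a component mask from `side_masks/` as `mask` restricts the output to that component's pixels.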

	## Annotation Tool

	Annotations created with a custom SAM2-based labeling tool:

	- Repository: https://github.com/ChangChrisLiu/gnn-world-model
	- Backend: FastAPI + SAM2 (`sam2.1_hiera_base_plus`)
	- Frontend: Vanilla HTML/JS, side-only interactive view
	- Tools: BBox, Point, Polygon, Brush, Eraser (all mask-editing operations)
	- Features: Dynamic component instances, AGENT badge for robot, scroll-to-zoom, undo/redo, per-frame delta-encoded visibility

	## License

	Released under CC BY 4.0. Use, share, and adapt freely with attribution.

	## Acknowledgements

	Built using:
	- [Segment Anything Model 2 (SAM2)](https://github.com/facebookresearch/sam2) by Meta AI
	- [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/)
	- [Hugging Face Datasets](https://huggingface.co/docs/datasets)