	# Oracle Ensemble: Multi-Source Validation and Rejection

	## 🎯 Overview

	The Oracle Ensemble system combines every available oracle source (ARKit poses, bundle-adjustment (BA) poses, LiDAR depth, IMU data, and a geometric consistency check) to build high-confidence training masks. Each DA3 prediction is compared against the oracles, and pixels or frames where the oracles do not agree are rejected. Training then uses only the pixels/points backed by multiple independent sources, which gives a higher-quality supervision signal.

	## 🔍 Core Concept

	Instead of choosing one oracle source, we use all of them together:

	```
	For each DA3 prediction:
	├─ Compare with ARKit poses (VIO)
	├─ Compare with BA poses (multi-view geometry)
	├─ Compare with LiDAR depth (direct ToF)
	├─ Check geometric consistency (reprojection error)
	└─ Check IMU consistency (motion matches sensors)

	→ Create confidence mask: Only train on pixels where oracles agree
	```

	## 📊 Oracle Sources and Accuracy

	### 1. ARKit Poses (VIO)

	- Accuracy: <1° rotation, <5cm translation (when tracking is good)
	- Coverage: Frame-level (all pixels in frame)
	- Trust Level: High (0.8) when tracking is "normal"
	- Limitations: Drift over long sequences, poor when tracking fails

	### 2. BA Poses (Multi-View Geometry)

	- Accuracy: <0.5° rotation, <2cm translation (after optimization)
	- Coverage: Frame-level (all pixels in frame)
	- Trust Level: Highest (0.9) - most robust
	- Limitations: Requires good feature matching, slower computation

	### 3. LiDAR Depth (Time-of-Flight)

	- Accuracy: ±1-2cm absolute error
	- Coverage: Pixel-level (sparse, ~10-30% of pixels)
	- Trust Level: Very High (0.95) - direct measurement
	- Limitations: Sparse coverage, only available on LiDAR-enabled devices

	### 4. Geometric Consistency

	- Accuracy: <2 pixels reprojection error
	- Coverage: Pixel-level (all pixels)
	- Trust Level: High (0.85) - enforces epipolar geometry
	- Limitations: Requires good depth predictions
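
One way to measure geometric consistency is to back-project a pixel using its predicted depth, move the 3D point into a second view, and compare its projection with the matched pixel there. The sketch below assumes an undistorted pinhole camera and a relative pose `(R_ab, t_ab)` that maps camera-A coordinates to camera-B coordinates. All function names are hypothetical, and the source does not say how correspondences are obtained.

```python
import math
from typing import List, Sequence, Tuple

Vec = List[float]
Mat = List[List[float]]


def backproject(u: float, v: float, depth: float, K: Sequence[float]) -> Vec:
    """Pixel (u, v) with depth -> 3D point in the camera frame."""
    fx, fy, cx, cy = K
    return [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]


def project(p: Vec, K: Sequence[float]) -> Tuple[float, float]:
    """3D point in the camera frame -> pixel coordinates."""
    fx, fy, cx, cy = K
    return fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy


def transform(R: Mat, t: Vec, p: Vec) -> Vec:
    """Apply the rigid transform p -> R p + t."""
    return [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]


def reprojection_error(u, v, depth, match_uv, R_ab, t_ab, K) -> float:
    """Pixel distance between the reprojected point and its match in view B."""
    p_b = transform(R_ab, t_ab, backproject(u, v, depth, K))
    u_b, v_b = project(p_b, K)
    return math.hypot(u_b - match_uv[0], v_b - match_uv[1])


K = (500.0, 500.0, 320.0, 240.0)          # fx, fy, cx, cy (illustrative values)
R_ab = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_ab = [-0.1, 0.0, 0.0]                    # camera B sits 10 cm to the right of A

# The same pixel and match, evaluated with a plausible and an implausible depth.
err_good = reprojection_error(320, 240, 2.0, (296, 240), R_ab, t_ab, K)
err_bad = reprojection_error(320, 240, 1.0, (296, 240), R_ab, t_ab, K)
consistent = [e <= 2.0 for e in (err_good, err_bad)]  # 2.0 px threshold
```

A wrong depth moves the reprojected point along the epipolar line, so the error grows quickly even when both poses are correct. This is why the check depends on good depth predictions.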

	### 5. IMU Data (Motion Sensors)

	- Accuracy: Velocity ±0.5 m/s, angular velocity ±0.1 rad/s
	- Coverage: Frame-level (motion between frames)
	- Trust Level: Medium (0.7) - indirect but useful
	- Limitations: Requires integration, may not be in ARKit metadata
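
A minimal sketch of an IMU consistency check, assuming the IMU readings have already been integrated into a speed and an angular speed for the interval between two frames. The function name, its arguments, and the way the two tests are combined are assumptions for illustration; the defaults mirror the accuracy figures listed above.

```python
import math
from typing import Sequence


def imu_consistent(center_prev: Sequence[float], center_curr: Sequence[float],
                   rotation_angle_rad: float, dt: float,
                   imu_speed: float, imu_angular_speed: float,
                   velocity_threshold: float = 0.5,
                   angular_velocity_threshold: float = 0.1) -> bool:
    """Check that pose-implied motion between two frames matches the IMU.

    center_prev / center_curr: camera centres in world coordinates (metres).
    rotation_angle_rad: angle of the relative rotation between the frames.
    """
    pose_speed = math.dist(center_prev, center_curr) / dt
    pose_angular_speed = abs(rotation_angle_rad) / dt
    return (abs(pose_speed - imu_speed) <= velocity_threshold
            and abs(pose_angular_speed - imu_angular_speed) <= angular_velocity_threshold)


dt = 1.0 / 30.0  # 30 fps, illustrative

# 2 cm of motion per frame is 0.6 m/s, close to the IMU estimate of 0.5 m/s.
ok = imu_consistent([0.0, 0.0, 0.0], [0.02, 0.0, 0.0], 0.01, dt, 0.5, 0.25)

# A 10 cm jump per frame is 3 m/s, which the IMU does not support.
jump = imu_consistent([0.0, 0.0, 0.0], [0.10, 0.0, 0.0], 0.01, dt, 0.5, 0.25)
```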

	## 🎚️ Confidence Mask Generation

	### Agreement Scoring

	For each pixel/frame, compute an agreement score:

	```python
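	# Trust-weighted vote, normalized to [0, 1]
	agreement_score = weighted_sum(oracle_votes) / total_weight

	# where:
	# - oracle_votes: 1 if the oracle agrees with the DA3 prediction, 0 if it disagrees
	# - weights: trust level of each oracle (0.7-0.95)
	# - total_weight: sum of the oracle weights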
	```
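
As a concrete illustration, the sketch below computes the score in plain Python for a single pixel. The function name `agreement_score` and the convention of skipping unavailable oracles (vote `None`) are assumptions for illustration, not the `ylff` implementation. The weights are the trust levels listed in this document.

```python
from typing import Dict, Optional

# Trust levels as listed in this document.
ORACLE_WEIGHTS: Dict[str, float] = {
    "arkit_pose": 0.8,
    "ba_pose": 0.9,
    "lidar_depth": 0.95,
    "geometric_consistency": 0.85,
    "imu": 0.7,
}


def agreement_score(votes: Dict[str, Optional[bool]],
                    weights: Dict[str, float] = ORACLE_WEIGHTS) -> float:
    """Trust-weighted fraction of the available oracles that agree.

    A vote of None marks an oracle with no data for this pixel (for example,
    no LiDAR return). It is left out of both numerator and denominator.
    This handling is an assumption, not confirmed by the source.
    """
    agree = 0.0
    total = 0.0
    for name, vote in votes.items():
        if vote is None:
            continue
        total += weights[name]
        if vote:
            agree += weights[name]
    return agree / total if total > 0 else 0.0


votes = {
    "arkit_pose": True,
    "ba_pose": True,
    "lidar_depth": False,          # LiDAR disagrees with the prediction
    "geometric_consistency": True,
    "imu": None,                   # no IMU data for this frame
}
score = agreement_score(votes)     # (0.8 + 0.9 + 0.85) / (0.8 + 0.9 + 0.95 + 0.85)
print(f"{score:.3f}")              # prints 0.729
```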

	### Rejection Strategy

	Per-Pixel Rejection:

	- Reject pixels where `agreement_score < min_agreement_ratio` (default: 0.7)
	- Train only on pixels where the trust-weighted agreement is at least 70%

	Per-Frame Rejection:

	- Reject entire frames if pose agreement is too low
	- Useful for sequences with tracking failures
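
The two rejection levels can be sketched in plain Python on nested lists. The per-frame cutoff `min_pose_confidence` is a hypothetical parameter: the source does not name one or state a default.

```python
from typing import List

Frame = List[List[float]]  # (H, W) agreement scores for one frame


def build_rejection_mask(agreement: List[Frame],
                         pose_confidence: List[float],
                         min_agreement_ratio: float = 0.7,
                         min_pose_confidence: float = 0.5) -> List[List[List[bool]]]:
    """Return an (N, H, W) mask where True means "reject this pixel"."""
    mask = []
    for frame_scores, pose_conf in zip(agreement, pose_confidence):
        # Per-frame rejection: a bad pose invalidates every pixel in the frame.
        frame_rejected = pose_conf < min_pose_confidence
        # Per-pixel rejection: drop pixels below the agreement threshold.
        mask.append([[frame_rejected or score < min_agreement_ratio for score in row]
                     for row in frame_scores])
    return mask


agreement = [
    [[0.9, 0.6], [0.75, 1.0]],   # frame 0: good pose, one weak pixel
    [[0.9, 0.9], [0.9, 0.9]],    # frame 1: pixels agree, but the pose does not
]
pose_confidence = [0.95, 0.2]
mask = build_rejection_mask(agreement, pose_confidence)
```

Here frame 0 loses only the pixel scored 0.6, while frame 1 is rejected entirely because of its low pose confidence.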

	### Confidence Mask

	```python
	confidence_mask = {
	    'pose_confidence': pose_confidence,    # (N,) frame-level scores in [0.0, 1.0]
	    'depth_confidence': depth_confidence,  # (N, H, W) pixel-level scores in [0.0, 1.0]
	    'rejection_mask': rejection_mask,      # (N, H, W) bool, True = pixel is rejected
	    'agreement_scores': agreement_scores,  # (N, H, W) trust-weighted fraction of oracles that agree
	}
	```

	## 🚀 Usage

	### Basic Usage

	```python
	from ylff.utils.oracle_ensemble import OracleEnsemble

	# Initialize ensemble
	ensemble = OracleEnsemble(
	    pose_rotation_threshold=2.0,      # degrees
	    pose_translation_threshold=0.05,  # meters
	    depth_relative_threshold=0.1,     # 10% relative error
	    min_agreement_ratio=0.7,          # Require 70% agreement
	)

	# Validate DA3 predictions
	results = ensemble.validate_da3_predictions(
	    da3_poses=da3_poses,          # (N, 3, 4) w2c
	    da3_depth=da3_depth,          # (N, H, W)
	    intrinsics=intrinsics,        # (N, 3, 3)
	    arkit_poses=arkit_poses_c2w,  # (N, 4, 4) c2w
	    ba_poses=ba_poses_w2c,        # (N, 3, 4) w2c
	    lidar_depth=lidar_depth,      # (N, H, W) optional
	)

	# Get confidence masks
	confidence_mask = results['confidence_mask'] # (N, H, W)
	rejection_mask = results['rejection_mask'] # (N, H, W) bool
	```
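
The call above mixes pose conventions: ARKit poses are camera-to-world (c2w), while the DA3 and BA poses are world-to-camera (w2c). Below is a minimal sketch of how one pose pair might be compared against `pose_rotation_threshold` and `pose_translation_threshold`. It assumes rigid transforms, the geodesic angle as rotation error, and the distance between camera centres as translation error. The source does not specify which error definitions `OracleEnsemble` uses, and all helper names are hypothetical.

```python
import math
from typing import List, Tuple

Mat = List[List[float]]


def invert_rigid(pose: Mat) -> Tuple[Mat, List[float]]:
    """Invert a rigid transform given as 3x4 or 4x4 rows [R | t].

    Returns (R^T, -R^T t). For a w2c pose this is the c2w rotation
    and the camera centre in world coordinates.
    """
    R = [row[:3] for row in pose[:3]]
    t = [row[3] for row in pose[:3]]
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return Rt, t_inv


def rotation_error_deg(Ra: Mat, Rb: Mat) -> float:
    """Geodesic angle between two rotation matrices, in degrees."""
    # trace(Ra^T Rb) equals the sum of elementwise products.
    trace = sum(Ra[i][j] * Rb[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def poses_agree(pred_w2c: Mat, oracle_c2w: Mat,
                rot_thresh_deg: float = 2.0, trans_thresh_m: float = 0.05) -> bool:
    """Compare a predicted w2c pose with an oracle c2w pose."""
    R_pred_c2w, centre_pred = invert_rigid(pred_w2c)
    R_oracle_c2w = [row[:3] for row in oracle_c2w[:3]]
    centre_oracle = [row[3] for row in oracle_c2w[:3]]
    rot_err = rotation_error_deg(R_pred_c2w, R_oracle_c2w)
    trans_err = math.dist(centre_pred, centre_oracle)
    return rot_err <= rot_thresh_deg and trans_err <= trans_thresh_m


def rot_z(deg: float) -> Mat:
    """Rotation about the z axis by the given angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Oracle: camera at x = 1.0 m, rotated 1 degree about z (c2w, 4x4).
oracle_c2w = [row + [t] for row, t in zip(rot_z(1.0), [1.0, 0.0, 0.0])] + [[0.0, 0.0, 0.0, 1.0]]

# Predictions (w2c, 3x4): identity rotation, camera centres at x = 1.02 m and x = 1.10 m.
pred_near = [[1.0, 0.0, 0.0, -1.02], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
pred_far = [[1.0, 0.0, 0.0, -1.10], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

near_ok = poses_agree(pred_near, oracle_c2w)  # 1 degree, 2 cm: within both thresholds
far_ok = poses_agree(pred_far, oracle_c2w)    # 10 cm translation error: rejected
```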

	### Training with Oracle Ensemble

	```python
	from ylff.utils.oracle_losses import oracle_ensemble_loss

	# Compute loss with confidence weighting
	loss_dict = oracle_ensemble_loss(
	    da3_output={
	        'poses': predicted_poses,  # (N, 3, 4)
	        'depth': predicted_depth,  # (N, H, W)
	    },
	    oracle_targets={
	        'poses': target_poses,  # (N, 3, 4)
	        'depth': target_depth,  # (N, H, W)
	    },
	    confidence_masks={
	        'pose_confidence': frame_confidence,   # (N,)
	        'depth_confidence': pixel_confidence,  # (N, H, W)
	    },
	    min_confidence=0.7,  # Only train on high-confidence pixels
	)

	total_loss = loss_dict['total_loss']
	```
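
A minimal sketch of what confidence-weighted depth supervision could look like, written in plain Python over flattened pixel lists. It assumes an L1 error, a hard cutoff at `min_confidence`, and the confidence used as a per-pixel weight. The actual `oracle_ensemble_loss` may differ, and `masked_depth_loss` is a hypothetical name.

```python
from typing import Sequence


def masked_depth_loss(pred: Sequence[float], target: Sequence[float],
                      confidence: Sequence[float],
                      min_confidence: float = 0.7) -> float:
    """Confidence-weighted mean absolute depth error over trusted pixels."""
    weighted_error = 0.0
    weight_sum = 0.0
    for p, t, c in zip(pred, target, confidence):
        if c < min_confidence:
            continue  # rejected pixel: contributes nothing to the loss
        weighted_error += c * abs(p - t)
        weight_sum += c
    # With no trusted pixels, return zero instead of dividing by zero.
    return weighted_error / weight_sum if weight_sum > 0 else 0.0


pred = [2.0, 3.0, 5.0]
target = [2.1, 3.0, 9.0]
confidence = [0.9, 0.8, 0.3]   # the last pixel falls below the cutoff
loss = masked_depth_loss(pred, target, confidence)
```

The third pixel has a 4 m error but does not affect the result, because its confidence of 0.3 is below `min_confidence`.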

	## 📈 Expected Results

	### Training Quality

	With Oracle Ensemble:

	- ✅ Only trains on pixels where multiple oracles agree
	- ✅ Rejects noisy/incorrect DA3 predictions
	- ✅ Higher-quality supervision signal
	- ✅ Better generalization

	Typical Rejection Rates:

	- 20-40% of pixels rejected (oracles disagree)
	- 5-15% of frames rejected (poor pose agreement)
	- Higher rejection in challenging scenes (low texture, motion blur)

	### Performance Impact

	Processing Time:

	- Oracle validation: +10-20% overhead
	- Training: Faster convergence expected, since the supervision signal is cleaner
	- Overall: Net positive, as the quality gain outweighs the validation overhead

	## ⚙️ Configuration

	### Thresholds

	```python
	ensemble = OracleEnsemble(
	    # Pose agreement
	    pose_rotation_threshold=2.0,      # degrees - stricter = more rejections
	    pose_translation_threshold=0.05,  # meters (5cm)

	    # Depth agreement
	    depth_relative_threshold=0.1,  # 10% relative error
	    depth_absolute_threshold=0.1,  # 10cm absolute error

	    # Geometric consistency
	    reprojection_error_threshold=2.0,  # pixels

	    # IMU consistency
	    imu_velocity_threshold=0.5,          # m/s
	    imu_angular_velocity_threshold=0.1,  # rad/s

	    # Minimum agreement
	    min_agreement_ratio=0.7,  # Require 70% of oracles to agree
	)
	```
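
How the relative and absolute depth thresholds combine is not stated in the source. The sketch below assumes a pixel agrees if it passes either test, so the effective tolerance is the larger of 10% of the LiDAR depth and 10 cm. It also assumes that pixels without a LiDAR return, encoded here as depth `0.0`, cast no vote. `depth_vote` is a hypothetical helper.

```python
from typing import Optional


def depth_vote(pred_depth: float, lidar_depth: float,
               relative_threshold: float = 0.1,
               absolute_threshold: float = 0.1) -> Optional[bool]:
    """Return True/False for agree/disagree, or None if LiDAR has no return."""
    if lidar_depth <= 0.0:
        return None  # no measurement at this pixel, so no vote
    abs_err = abs(pred_depth - lidar_depth)
    rel_err = abs_err / lidar_depth
    # Assumed combination: passing either test counts as agreement.
    return rel_err <= relative_threshold or abs_err <= absolute_threshold


print(depth_vote(2.15, 2.0))   # True: 7.5% relative error
print(depth_vote(0.38, 0.30))  # True: 27% relative, but only 8 cm absolute
print(depth_vote(4.0, 3.0))    # False: 33% relative and 1 m absolute
print(depth_vote(1.5, 0.0))    # None: no LiDAR return
```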

	### Oracle Weights

	Customize trust levels:

	```python
	ensemble = OracleEnsemble(
	    oracle_weights={
	        'arkit_pose': 0.8,              # High trust when tracking is good
	        'ba_pose': 0.9,                 # Highest trust
	        'lidar_depth': 0.95,            # Very high trust (direct measurement)
	        'imu': 0.7,                     # Medium trust
	        'geometric_consistency': 0.85,  # High trust
	    }
	)
	```
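
To see how the weights interact with `min_agreement_ratio`, the following worked example enumerates which oracles may dissent before a pixel is rejected. It assumes all five oracles vote on the pixel and that the score is the trust-weighted fraction of agreeing oracles. The helper name is hypothetical.

```python
from itertools import combinations

oracle_weights = {
    "arkit_pose": 0.8,
    "ba_pose": 0.9,
    "lidar_depth": 0.95,
    "imu": 0.7,
    "geometric_consistency": 0.85,
}
min_agreement_ratio = 0.7
total = sum(oracle_weights.values())


def score_if_dissenting(dissenters):
    """Agreement score when exactly the named oracles disagree."""
    return (total - sum(oracle_weights[d] for d in dissenters)) / total


# Lowest score with one dissenter and highest score with two dissenters.
worst_single = min(score_if_dissenting([d]) for d in oracle_weights)
best_pair = max(score_if_dissenting(p) for p in combinations(oracle_weights, 2))

print(f"worst single-dissenter score: {worst_single:.3f}")  # 0.774
print(f"best two-dissenter score:     {best_pair:.3f}")     # 0.643
```

With these weights and a threshold of 0.7, any single dissenting oracle is tolerated, while any two dissenting oracles cause rejection. Raising the threshold to 0.8 would reject a pixel as soon as any one oracle disagrees.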

	## 🔬 Advanced Usage

	### Per-Oracle Analysis

	```python
	results = ensemble.validate_da3_predictions(...)

	# Individual oracle votes
	oracle_votes = results['oracle_votes']
	arkit_agreement = oracle_votes['arkit_pose'] # (N, 1, 1)
	ba_agreement = oracle_votes['ba_pose'] # (N, 1, 1)
	lidar_agreement = oracle_votes['lidar_depth'] # (N, H, W)

	# Error metrics
	rotation_errors = results['rotation_errors'] # (N, 2) [arkit, ba]
	translation_errors = results['translation_errors'] # (N, 2)
	depth_relative_errors = results['relative_errors'] # (N, H, W)
	```

	### Adaptive Thresholds

	Adjust thresholds based on scene difficulty:

	```python
	# Easy scene (good tracking, high texture)
	ensemble_easy = OracleEnsemble(
	    pose_rotation_threshold=1.0,  # Stricter
	    min_agreement_ratio=0.8,      # Require more agreement
	)

	# Hard scene (poor tracking, low texture)
	ensemble_hard = OracleEnsemble(
	    pose_rotation_threshold=3.0,  # More lenient
	    min_agreement_ratio=0.6,      # Require less agreement
	)
	```

	## 💡 Best Practices

	### 1. Start Conservative

	Begin with strict thresholds, then relax if needed:

	```python
	min_agreement_ratio=0.8 # Start high
	pose_rotation_threshold=1.0 # Stricter
	```

	### 2. Monitor Rejection Rates

	Track how many pixels/frames are rejected:

	```python
	import logging

	logger = logging.getLogger(__name__)

	rejection_rate = rejection_mask.float().mean().item()  # fraction of True entries
	logger.info(f"Rejection rate: {rejection_rate:.1%}")
	```
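
A single global rate can hide a few badly tracked frames. The sketch below breaks the rate down per frame, using plain nested lists in place of tensors. The function name and the 40% flagging bound are illustrative choices; the bound is taken from the upper end of the typical pixel rejection range quoted in this document.

```python
from typing import List


def per_frame_rejection_rates(rejection_mask: List[List[List[bool]]]) -> List[float]:
    """Fraction of rejected pixels in each frame of an (N, H, W) mask."""
    rates = []
    for frame in rejection_mask:
        pixels = [p for row in frame for p in row]
        rates.append(sum(pixels) / len(pixels))
    return rates


mask = [
    [[False, False], [False, True]],   # 1 of 4 pixels rejected
    [[True, True], [True, False]],     # 3 of 4 pixels rejected
]
rates = per_frame_rejection_rates(mask)

# Flag frames that reject more than 40% of their pixels.
flagged = [i for i, rate in enumerate(rates) if rate > 0.4]
```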

	### 3. Use All Available Oracles

	Don't skip oracles - more sources = better validation:

	```python
	# Always include all available sources
	results = ensemble.validate_da3_predictions(
	    da3_poses=...,
	    da3_depth=...,
	    arkit_poses=arkit_poses,  # Include if available
	    ba_poses=ba_poses,        # Include if available
	    lidar_depth=lidar_depth,  # Include if available
	)
	```

	### 4. Visualize Confidence Masks

	```python
	import matplotlib.pyplot as plt

	# Visualize confidence
	plt.imshow(confidence_mask[0], cmap='hot')  # first frame of the (N, H, W) mask
	plt.colorbar(label='Confidence')
	plt.title('Oracle Agreement Confidence')
	plt.show()
	```

	## 🎓 Why This Works

	Multiple Independent Sources:

	- Each oracle has different failure modes
	- Agreement across multiple sources = high confidence
	- Disagreement = likely error in DA3 prediction

	Confidence-Weighted Training:

	- Train more on high-confidence pixels
	- Reject low-confidence pixels
	- Better supervision signal = better model

	Robust to Oracle Failures:

	- If one oracle fails, others can still validate
	- Weighted voting reduces impact of single failures
	- Minimum agreement ratio ensures consensus

	## 📊 Statistics

	After processing, you'll see a summary like:

	```
	Oracle Ensemble Validation:
	- ARKit pose agreement: 85.2% of frames
	- BA pose agreement: 92.1% of frames
	- LiDAR depth agreement: 78.5% of pixels (where available)
	- Geometric consistency: 91.3% of pixels
	- Overall confidence: 0.87 (mean)
	- Rejection rate: 23.1% of pixels
	```

	This system enables high-quality training by only using pixels where multiple independent sources agree! 🚀