RANSAC Track Classifier
Real-time classification of RANSAC-tracked feature points in FPS gameplay as WORLD (stable scene geometry) or IGNORE (HUD, weapon, hands, overlays).
Architecture
Input: (B, N, T=16, F=10)
       │
┌──────┴──────┐
│  Temporal   │  Causal Dilated TCN (per-track, parallel)
│   Encoder   │  3 blocks, dilations [1,2,4], RF=15 frames
└──────┬──────┘
       │ (B, N, 64)
┌──────┴──────┐
│ Cross-Track │  DeepSets Equivariant × 4
│   Context   │  f(x_i) = σ(W·x_i + V·mean(X))
└──────┬──────┘  mean(X) ≈ consensus camera motion
       │ (B, N, 64)
┌──────┴──────┐
│ Classifier  │  MLP → sigmoid → world_probability
│    Head     │
└──────┬──────┘
       │ (B, N, 1)
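The 15-frame receptive field quoted for the temporal encoder can be checked with a few lines of arithmetic. The sketch below assumes a kernel size of 3 and one causal convolution per block; neither value is stated in this README, they are simply the choice that reproduces RF=15 for dilations [1,2,4].

```python
def causal_tcn_receptive_field(dilations, kernel_size=3, convs_per_block=1):
    """Receptive field, in frames, of stacked causal dilated 1-D convolutions.

    Each convolution with dilation d widens the window by (kernel_size - 1) * d.
    kernel_size and convs_per_block are assumptions, not values from the README.
    """
    rf = 1
    for d in dilations:
        rf += convs_per_block * (kernel_size - 1) * d
    return rf


rf = causal_tcn_receptive_field([1, 2, 4])
print(rf)  # 15
```

Under these assumptions the receptive field fits inside the T=16 input window, so the last time step sees almost the whole window.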
57K parameters · 223 KB · <2 ms on GPU for 1000 tracks
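To make the cross-track formula f(x_i) = σ(W·x_i + V·mean(X)) concrete, here is a minimal pure-Python sketch of one such layer, followed by a check that reordering the tracks reorders the outputs in the same way. It is an illustration only, not the code in track_classifier.py: ReLU stands in for σ (the README does not say which nonlinearity is used), and the function names and tiny dimensions are hypothetical.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def deepsets_layer(X, W, V):
    """Apply f(x_i) = relu(W x_i + V mean(X)) to every track embedding in X.

    X: list of N vectors of length D; W, V: matrices given as lists of rows.
    ReLU is an assumed stand-in for the unspecified nonlinearity.
    """
    n, d = len(X), len(X[0])
    mean = [sum(x[j] for x in X) / n for j in range(d)]  # shared context
    context = matvec(V, mean)
    return [[max(0.0, a + c) for a, c in zip(matvec(W, x), context)] for x in X]


random.seed(0)
D = 4
X = [[random.gauss(0, 1) for _ in range(D)] for _ in range(5)]
W = [[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]
V = [[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]

perm = [3, 0, 4, 1, 2]
out = deepsets_layer(X, W, V)
out_perm = deepsets_layer([X[i] for i in perm], W, V)

# Row k of out_perm belongs to track perm[k], so it should match out[perm[k]].
equivariant = all(
    math.isclose(a, b, abs_tol=1e-9)
    for i, row in zip(perm, out_perm)
    for a, b in zip(out[i], row)
)
print(equivariant)  # True
```

Because every track sees the same mean(X) term, the layer does not depend on track order or track count, which suits a tracker whose set of live tracks changes from frame to frame.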
Input Features (10 per frame per track)
| # | Feature | Description |
|---|---|---|
| 0 | vx | Velocity X (pixels/frame) |
| 1 | vy | Velocity Y |
| 2 | speed | ‖velocity‖ |
| 3 | accel | ‖acceleration‖ |
| 4 | residual | RANSAC residual error |
| 5 | inlier | RANSAC inlier flag (0/1) |
| 6 | confidence | Tracking confidence [0,1] |
| 7 | visibility | Visible this frame (0/1) |
| 8 | age_norm | Track age / 60 |
| 9 | angular_vel | Change in motion direction |
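The motion features in the table (0 to 3, 8 and 9) can be derived from consecutive track positions. The sketch below shows one plausible way to do that. The exact definitions used by this project are not given here, so the finite-difference acceleration, the wrapped heading difference for angular_vel, and the helper names are all assumptions. The remaining features (residual, inlier, confidence, visibility) come from the tracker and RANSAC stage and are not computed here.

```python
import math


def frame_features(prev_pos, pos, prev_vel=None):
    """Derive vx, vy, speed, accel and angular_vel for one track at one frame.

    prev_pos, pos: (x, y) pixel positions in consecutive frames.
    prev_vel: (vx, vy) from the previous frame, or None on the first step.
    """
    vx, vy = pos[0] - prev_pos[0], pos[1] - prev_pos[1]
    speed = math.hypot(vx, vy)
    if prev_vel is None:
        accel, angular_vel = 0.0, 0.0
    else:
        # Magnitude of the frame-to-frame velocity change.
        accel = math.hypot(vx - prev_vel[0], vy - prev_vel[1])
        # Heading change in radians, wrapped to [-pi, pi).
        dtheta = math.atan2(vy, vx) - math.atan2(prev_vel[1], prev_vel[0])
        angular_vel = (dtheta + math.pi) % (2 * math.pi) - math.pi
    return {"vx": vx, "vy": vy, "speed": speed,
            "accel": accel, "angular_vel": angular_vel}


def age_norm(age_frames):
    """Track age divided by 60, as listed in the table (no clipping assumed)."""
    return age_frames / 60.0


f1 = frame_features((100.0, 50.0), (103.0, 54.0))
f2 = frame_features((103.0, 54.0), (103.0, 59.0), prev_vel=(f1["vx"], f1["vy"]))
print(f1["speed"])   # 5.0
print(age_norm(30))  # 0.5
```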
Files
| File | Description |
|---|---|
| track_classifier.py | Full model + TCN streaming + GRU streaming |
| data_generator.py | Synthetic data (world, HUD, weapon, scope, recoil) |
| train.py | Training loop with masked BCE, class weighting |
| inference.py | Batch/streaming demos + latency benchmark |
Quick Start
from track_classifier import ModelConfig, create_model, create_streaming
import torch

# Create and load model
cfg = ModelConfig()
model = create_model(cfg, device='cuda')
checkpoint = torch.load('best_model.pt')
model.load_state_dict(checkpoint['model_state_dict'])

# Streaming inference
streamer = create_streaming(model, torch.device('cuda'), use_fp16=True)

# Per frame: push features, get classifications
for track_id, features in your_ransac_output.items():
    streamer.push_frame(track_id, features)
results = streamer.classify()  # {track_id: world_probability}
world_tracks = [tid for tid, p in results.items() if p > 0.7]
Training
python train.py
Trains on synthetic data by default. To train on real data, replace generate_dataset() with a loader for your annotated tracks.
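The training loop is described as using masked BCE with class weighting. As a rough illustration of that idea, and not the loss actually implemented in train.py, the sketch below averages binary cross-entropy over real tracks only and skips padded slots. The function name, the pos_weight parameter, and the choice to weight the WORLD class are assumptions.

```python
import math


def masked_weighted_bce(probs, labels, mask, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy over valid tracks only.

    probs:  predicted world probabilities in (0, 1)
    labels: 1 for WORLD, 0 for IGNORE
    mask:   1 for real tracks, 0 for padding slots
    pos_weight: extra weight on WORLD examples (hypothetical parameter)
    """
    total, count = 0.0, 0
    for p, y, m in zip(probs, labels, mask):
        if not m:
            continue  # padded slots contribute nothing to the loss
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / max(count, 1)


# The third slot is padding, so its prediction is ignored.
loss = masked_weighted_bce([0.9, 0.2, 0.5], [1, 0, 1], [1, 1, 0])
print(round(loss, 4))  # 0.1643
```

Masking matters because the number of live tracks N varies per frame, so batches are typically padded to a fixed size.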
Two Streaming Modes
- TCN + Buffer (recommended): Keeps a circular buffer per track and recomputes the 16-frame window each frame. Best accuracy.
- GRU + Hidden State: O(1) work per frame with no buffer. Lower accuracy, minimal latency.
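The TCN + Buffer mode can be pictured as one fixed-length ring buffer per track ID. The sketch below is a stand-in for illustration, not the streaming class in track_classifier.py: the class name TrackBuffers, the zero-padding of young tracks, and the drop method are assumptions. It only shows how a 16-frame window could be maintained before being handed to the model.

```python
from collections import deque

WINDOW = 16        # frames per window (T in the architecture diagram)
NUM_FEATURES = 10  # features per frame per track


class TrackBuffers:
    """Keeps the latest WINDOW feature vectors for each track ID."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.buffers = {}

    def push_frame(self, track_id, features):
        if len(features) != NUM_FEATURES:
            raise ValueError("expected %d features" % NUM_FEATURES)
        if track_id not in self.buffers:
            self.buffers[track_id] = deque(maxlen=self.window)
        # With maxlen set, appending drops the oldest frame automatically.
        self.buffers[track_id].append(list(features))

    def window_for(self, track_id):
        """Return WINDOW rows, zero-padded at the start for young tracks."""
        buf = self.buffers[track_id]
        pad = [[0.0] * NUM_FEATURES for _ in range(self.window - len(buf))]
        return pad + list(buf)

    def drop(self, track_id):
        """Forget a track that the tracker has lost."""
        self.buffers.pop(track_id, None)


buffers = TrackBuffers()
for frame in range(20):
    buffers.push_frame(7, [float(frame)] * NUM_FEATURES)
window = buffers.window_for(7)
print(len(window), window[0][0], window[-1][0])  # 16 4.0 19.0
```

The GRU mode would replace the per-track deque with a single hidden-state vector per track, which is where its O(1) per-frame cost comes from.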
Hard Cases Handled
- Scope glass: Moves with camera but with different parallax
- Recoil transients: Brief disruption to all tracks
- Weapon sway/bob: Sinusoidal independent motion
- ADS transitions: Weapon occupies screen center
- HUD overlays: Zero parallax, fixed screen position
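Three of these cases differ mainly in how a point moves on screen while the camera moves. The toy generator below illustrates that contrast under strong simplifications: a camera translating sideways at constant speed, world motion scaled by inverse depth, and a weapon that sways sinusoidally. It is not the logic in data_generator.py, and every name and parameter is hypothetical.

```python
import math


def synth_track(kind, num_frames=16, camera_vx=4.0, depth=1.0,
                sway_amp=3.0, sway_period=12.0, start=(320.0, 240.0)):
    """Return a list of (x, y) screen positions for one synthetic track.

    kind: "world"  moves opposite to the camera, scaled by 1/depth (parallax)
          "hud"    stays at a fixed screen position
          "weapon" sways sinusoidally around its start position
    """
    x0, y0 = start
    points = []
    for t in range(num_frames):
        if kind == "world":
            points.append((x0 - camera_vx * t / depth, y0))
        elif kind == "hud":
            points.append((x0, y0))
        elif kind == "weapon":
            phase = 2.0 * math.pi * t / sway_period
            points.append((x0 + sway_amp * math.sin(phase), y0))
        else:
            raise ValueError("unknown kind: %r" % kind)
    return points


hud = synth_track("hud")
far = synth_track("world", depth=1.0)
near = synth_track("world", depth=0.5)
weapon = synth_track("weapon")

print(far[1][0] - far[0][0])    # -4.0
print(near[1][0] - near[0][0])  # -8.0
print(len(set(hud)))            # 1
```

World points at different depths move at different speeds but in a consistent direction, whereas HUD points do not move at all and weapon points move independently of the camera.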