RANSAC Track Classifier

Real-time classification of RANSAC-tracked feature points in FPS gameplay as WORLD (stable scene geometry) or IGNORE (HUD, weapon, hands, overlays).

Architecture

Input: (B, N, T=16, F=10)
         │
   ┌─────┴──────┐
   │  Temporal   │  Causal Dilated TCN (per-track, parallel)
   │  Encoder    │  3 blocks, dilations [1,2,4], RF=15 frames
   └─────┬──────┘
         │  (B, N, 64)
   ┌─────┴──────┐
   │ Cross-Track │  DeepSets Equivariant × 4
   │  Context    │  f(x_i) = σ(W·x_i + V·mean(X))
   └─────┬──────┘  mean(X) ≈ consensus camera motion
         │  (B, N, 64)
   ┌─────┴──────┐
   │ Classifier  │  MLP → sigmoid → world_probability
   │    Head     │
   └─────┬──────┘
         │  (B, N, 1)
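
The RF=15 figure in the diagram can be checked by hand. The sketch below assumes a kernel size of 3 and one convolution per block; neither is stated above, but together with dilations [1,2,4] they reproduce the 15-frame receptive field. The helper names are illustrative and are not taken from `track_classifier.py`.

```python
def receptive_field(kernel_size, dilations, convs_per_block=1):
    """Frames of history seen by the last output of a causal dilated conv stack."""
    rf = 1
    for d in dilations:
        # Each conv widens the view by (kernel_size - 1) * dilation frames.
        rf += convs_per_block * (kernel_size - 1) * d
    return rf


def left_padding(kernel_size, dilation):
    """Zeros to prepend so a conv keeps length T and never reads future frames."""
    return (kernel_size - 1) * dilation


# Assumed kernel size 3; dilations come from the diagram.
rf = receptive_field(kernel_size=3, dilations=[1, 2, 4])
pads = [left_padding(3, d) for d in [1, 2, 4]]
print(rf, pads)  # 15 [2, 4, 8]
```

Under these assumptions the encoder sees 15 of the 16 frames in the window, so the oldest frame does not influence the final time step.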

57K parameters · 223 KB · <2ms on GPU for 1000 tracks
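
The cross-track update f(x_i) = σ(W·x_i + V·mean(X)) can be sketched in a few lines of NumPy. This is a minimal illustration, not the layer from `track_classifier.py`: the batch dimension is dropped, ReLU stands in for σ, and the weights are random. It shows the property that matters here, which is that reordering the tracks reorders the outputs in the same way.

```python
import numpy as np


def deepsets_layer(X, W, V, b):
    """One equivariant update: f(x_i) = sigma(W x_i + V mean(X)).

    X: (N, D) per-track embeddings; W, V: (D, D); b: (D,).
    ReLU is used for sigma here, which is an assumption.
    """
    pooled = X.mean(axis=0, keepdims=True)  # (1, D) summary shared by all tracks
    return np.maximum(X @ W.T + pooled @ V.T + b, 0.0)


rng = np.random.default_rng(0)
N, D = 5, 64  # 5 tracks, 64-dim embeddings as in the diagram
X = rng.normal(size=(N, D))
W = rng.normal(size=(D, D)) / np.sqrt(D)
V = rng.normal(size=(D, D)) / np.sqrt(D)
b = np.zeros(D)

perm = rng.permutation(N)
out = deepsets_layer(X, W, V, b)
out_perm = deepsets_layer(X[perm], W, V, b)

# Permuting the input tracks permutes the outputs identically.
assert np.allclose(out[perm], out_perm)
```

Because the pooled term is a mean over all tracks, the layer works for any number of tracks N without retraining or padding to a fixed size.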

Input Features (10 per frame per track)

#	Feature	Description
0	`vx`	Velocity X (pixels/frame)
1	`vy`	Velocity Y (pixels/frame)
2	`speed`	‖velocity‖
3	`accel`	‖acceleration‖
4	`residual`	RANSAC residual error
5	`inlier`	RANSAC inlier flag (0/1)
6	`confidence`	Tracking confidence [0,1]
7	`visibility`	Visible this frame (0/1)
8	`age_norm`	Track age / 60
9	`angular_vel`	Change in motion direction
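
As a rough guide to how the kinematic columns relate to each other, the sketch below derives `vx`, `vy`, `speed`, `accel`, `angular_vel` and `age_norm` from raw track positions. The exact definitions used by the repository are not given above, so these are assumptions: acceleration is taken as the norm of the frame-to-frame velocity difference, angular velocity as the wrapped change in heading (radians per frame), and track age is measured in frames. `kinematic_features` is a hypothetical helper.

```python
import math


def kinematic_features(positions, age_frames):
    """Features for the latest frame from the last three (x, y) positions.

    positions: [(x, y), ...] in pixels, oldest first, at least 3 entries.
    """
    (x0, y0), (x1, y1), (x2, y2) = positions[-3:]
    pvx, pvy = x1 - x0, y1 - y0  # previous-frame velocity
    vx, vy = x2 - x1, y2 - y1    # current velocity, pixels/frame
    speed = math.hypot(vx, vy)
    accel = math.hypot(vx - pvx, vy - pvy)
    # Heading change wrapped to [-pi, pi]; set to 0 when either step is static.
    if speed == 0.0 or math.hypot(pvx, pvy) == 0.0:
        angular_vel = 0.0
    else:
        delta = math.atan2(vy, vx) - math.atan2(pvy, pvx)
        angular_vel = math.atan2(math.sin(delta), math.cos(delta))
    return {"vx": vx, "vy": vy, "speed": speed, "accel": accel,
            "angular_vel": angular_vel, "age_norm": age_frames / 60}


# A track that moves 3 px right, then 4 px down-screen (a 90 degree turn).
f = kinematic_features([(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)], age_frames=30)
```

The remaining columns (`residual`, `inlier`, `confidence`, `visibility`) come straight from the tracker and RANSAC stage rather than from positions.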

Files

File	Description
`track_classifier.py`	Full model + TCN streaming + GRU streaming
`data_generator.py`	Synthetic data (world, HUD, weapon, scope, recoil)
`train.py`	Training loop with masked BCE, class weighting
`inference.py`	Batch/streaming demos + latency benchmark

Quick Start

from track_classifier import ModelConfig, create_model, create_streaming
import torch

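# Create the model and load the trained weights
cfg = ModelConfig()
model = create_model(cfg, device='cuda')
checkpoint = torch.load('best_model.pt', map_location='cuda')  # load tensors straight onto the GPU
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # inference mode: disables dropout, uses stored normalization statistics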

# Streaming inference
streamer = create_streaming(model, torch.device('cuda'), use_fp16=True)

# Per frame: push features, get classifications
for track_id, features in your_ransac_output.items():
    streamer.push_frame(track_id, features)

results = streamer.classify()  # {track_id: world_probability}
world_tracks = [tid for tid, p in results.items() if p > 0.7]

Training

python train.py

Trains on synthetic data by default. To train on real data, replace the `generate_dataset()` call with a loader for your own annotated tracks.
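
The file list describes the loss as masked BCE with class weighting. The sketch below shows one plausible reading of that, written in plain Python for clarity: the mask is assumed to mark padded track slots, and `pos_weight` is assumed to scale the WORLD class. It is not the code from `train.py`, and `masked_weighted_bce` is a hypothetical name.

```python
import math


def masked_weighted_bce(probs, labels, mask, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy over valid tracks only.

    probs:  predicted world_probability per track, in (0, 1)
    labels: 1 for WORLD, 0 for IGNORE
    mask:   1 for a real track, 0 for a padding slot (assumed meaning)
    pos_weight scales the loss of WORLD examples (class weighting).
    """
    total, count = 0.0, 0
    for p, y, m in zip(probs, labels, mask):
        if not m:
            continue                     # padded slots contribute nothing
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / max(count, 1)


# The third slot is padding, so its prediction is ignored.
loss = masked_weighted_bce(
    probs=[0.9, 0.2, 0.5], labels=[1, 0, 1], mask=[1, 1, 0], pos_weight=2.0)
```

Averaging over the number of valid tracks, rather than over all slots, keeps the loss scale independent of how much padding a batch contains.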

Two Streaming Modes

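TCN + Buffer (recommended): Keeps a circular buffer per track and re-runs the TCN over the latest 16-frame window at each classification. Best accuracy.
GRU + Hidden State: Carries one hidden state per track, so each new frame costs O(1) and no frame buffer is needed. Lower accuracy, lowest latency.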
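
The buffering side of the TCN mode can be sketched with a bounded deque per track. This is an illustration under stated assumptions, not the streamer from `track_classifier.py`: `TrackWindowBuffer` is a hypothetical class, and zero-padding the start of the window for tracks younger than 16 frames is a guess at a reasonable default.

```python
from collections import deque

WINDOW = 16        # T in the architecture diagram
NUM_FEATURES = 10  # F in the architecture diagram


class TrackWindowBuffer:
    """Hypothetical per-track ring buffer holding the latest WINDOW frames."""

    def __init__(self):
        self._buffers = {}

    def push_frame(self, track_id, features):
        if len(features) != NUM_FEATURES:
            raise ValueError("expected %d features" % NUM_FEATURES)
        buf = self._buffers.setdefault(track_id, deque(maxlen=WINDOW))
        buf.append(list(features))  # the oldest frame is dropped automatically

    def window(self, track_id):
        """Return a WINDOW x NUM_FEATURES list, zero-padded at the start."""
        buf = self._buffers[track_id]
        pad = [[0.0] * NUM_FEATURES for _ in range(WINDOW - len(buf))]
        return pad + list(buf)

    def drop(self, track_id):
        self._buffers.pop(track_id, None)  # call when the tracker loses a track


buffers = TrackWindowBuffer()
for frame in range(20):
    buffers.push_frame(7, [float(frame)] * NUM_FEATURES)
win = buffers.window(7)  # frames 4..19, oldest first
```

Stacking the windows of all live tracks gives the (N, T, F) tensor the model expects. The GRU mode would replace each deque with a single hidden-state vector per track.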

Hard Cases Handled

Scope glass: Moves with camera but with different parallax
Recoil transients: Brief disruption to all tracks
Weapon sway/bob: Sinusoidal independent motion
ADS transitions: Weapon occupies screen center
HUD overlays: Zero parallax, fixed screen position
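
To make the motion patterns above concrete, here is a toy generator for three of the cases. All amplitudes, periods and noise levels are invented for illustration and are unrelated to `data_generator.py`. It measures how far each track's velocity deviates from the camera-induced motion over a 16-frame window.

```python
import math
import random


def toy_velocity(kind, frame, camera_v, rng):
    """Per-frame (vx, vy) in pixels/frame for one toy track of the given kind."""
    cam_vx, cam_vy = camera_v
    if kind == "world":
        # Scene points follow the apparent camera motion, plus tracker noise.
        return cam_vx + rng.gauss(0, 0.2), cam_vy + rng.gauss(0, 0.2)
    if kind == "hud":
        # Overlays stay at a fixed screen position regardless of the camera.
        return 0.0, 0.0
    if kind == "weapon":
        # Sway/bob: sinusoidal motion independent of the camera.
        phase = 2 * math.pi * frame / 30
        return 1.5 * math.cos(phase), 0.8 * math.sin(2 * phase)
    raise ValueError(kind)


rng = random.Random(0)
camera_v = (4.0, -1.0)  # made-up apparent scene motion for this window
deviation = {}
for kind in ("world", "hud", "weapon"):
    errs = []
    for t in range(16):
        vx, vy = toy_velocity(kind, t, camera_v, rng)
        errs.append(math.hypot(vx - camera_v[0], vy - camera_v[1]))
    deviation[kind] = sum(errs) / len(errs)
```

In this toy setup the world track deviates only by noise, while the HUD and weapon tracks deviate by several pixels per frame. Scope glass and recoil are harder precisely because their deviation from the consensus motion is small or brief.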
