TeleopWM

Aws Khalil, Jaerock Kwon
Bio-Inspired Machine Intelligence (BIMI) Lab
University of Michigan–Dearborn

TeleopWM is a lightweight predictive latent world model for latency-resilient vision-based teleoperation. Given recent RGB observations and teleoperation control history, it predicts short-horizon future visual observations and future longitudinal/steering trends for predictive display.

It targets short-horizon predictive display and future action forecasting under teleoperation latency, and keeps inference lightweight enough for real-time use.

Paper and Project Links

Model Description

TeleopWM predicts 8 future RGB frames and future longitudinal/steering trends from recent visual observations and teleoperation control history. The model uses a SimVP visual backbone together with a TeleopWM latent dynamics branch, and is designed for real-time predictive display under teleoperation latency.

The checkpoint was trained and evaluated on CARLA/MILE-style driving rollouts. TeleopWM is intended as a compact research model for short-horizon predictive continuity, not as an open-ended video generation or autonomous-driving foundation model.
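
One way to picture predictive display under latency: given a measured end-to-end delay, show the predicted frame whose time offset best matches that delay. The sketch below illustrates only that idea. The function name, the round-half-up rule, and the choice of a single frame per delay are assumptions made for illustration, not the TeleopWM implementation; the 8-frame horizon and 15 FPS rate are taken from this model card.

```python
import math

HORIZON_FRAMES = 8    # rollout length stated in the model card
FRAME_RATE_FPS = 15   # frame rate stated in the results summary


def select_rollout_step(latency_ms, fps=FRAME_RATE_FPS, horizon=HORIZON_FRAMES):
    """Return how many frames ahead (0..horizon) to display for a given delay.

    Hypothetical helper: 0 means "show the latest received frame",
    k > 0 means "show the k-th predicted frame of the rollout".
    """
    if latency_ms < 0:
        raise ValueError("latency_ms must be non-negative")
    # Delay expressed in frame periods, e.g. 200 ms at 15 FPS is 3 frames.
    frames_behind = latency_ms * fps / 1000.0
    # Round half up, then clamp to the available prediction horizon.
    step = math.floor(frames_behind + 0.5)
    return min(step, horizon)
```

Under these assumptions, any delay beyond the roughly 533 ms horizon is clamped to the last predicted frame, which is one reason the model is described as short-horizon rather than open-ended.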

Architecture

TeleopWM combines a SimVP visual backbone with a lightweight latent dynamics module and a motion-aware future action prediction head. The model jointly predicts future visual observations and future driving actions within a unified predictive framework designed for latency-resilient teleoperation.

[Figure: TeleopWM method overview]

Intended Use

  • Research on latency-resilient vision-based teleoperation
  • Predictive display under communication latency
  • Short-horizon future observation prediction
  • Future action trend prediction
  • CARLA/MILE-style driving rollout analysis

Out-of-Scope Use

  • Safety-critical autonomous driving deployment without validation
  • Open-ended video generation
  • Direct real-vehicle deployment without additional testing
  • General-purpose world modeling outside the evaluated driving domain

Files

  • best.pt — final TeleopWM paper checkpoint
  • config.json — training/evaluation configuration associated with the checkpoint
  • benchmark.json — runtime benchmark summary
  • future_action_eval.png — future action evaluation figure
  • main_rollout_action_figure_final.png — qualitative rollout/action alignment figure

Results Summary

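| Category                 | Metric            | Value                                      |
|--------------------------|-------------------|--------------------------------------------|
| Rollout prediction       | Horizon           | 8 frames (approximately 533 ms at 15 FPS)  |
| Future action prediction | Outputs           | Longitudinal and steering trends           |
| Runtime                  | Inference latency | 38.9 ms per rollout                        |
| Runtime                  | Prediction rate   | 205.5 FPS                                  |
| Runtime                  | Peak VRAM         | 1.24 GB                                    |
| Resolution               | Input/output      | 320x512                                    |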

Runtime values are reference measurements from the final paper configuration and should be re-measured on target hardware.
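
The horizon and throughput figures follow from simple arithmetic, which the short sketch below reproduces. It assumes that the prediction rate counts predicted frames per second (8 frames per rollout divided by the rollout latency); that reading is an inference from the reported numbers, not something the model card states. The variable names are illustrative.

```python
FRAMES_PER_ROLLOUT = 8       # reported horizon
FRAME_RATE_FPS = 15          # reported frame rate
ROLLOUT_LATENCY_MS = 38.9    # reported inference latency per rollout

# Wall-clock time covered by one rollout.
horizon_ms = FRAMES_PER_ROLLOUT / FRAME_RATE_FPS * 1000.0

# Assumed definition: predicted frames produced per second of compute.
frames_per_second = FRAMES_PER_ROLLOUT / (ROLLOUT_LATENCY_MS / 1000.0)

# Full 8-frame rollouts that fit into one second of compute.
rollouts_per_second = 1000.0 / ROLLOUT_LATENCY_MS

print(f"horizon: {horizon_ms:.1f} ms")
print(f"prediction rate: {frames_per_second:.1f} frames/s")
print(f"rollout rate: {rollouts_per_second:.1f} rollouts/s")
```

With the rounded latency of 38.9 ms this gives a horizon of about 533 ms and a rate of roughly 205.7 frames per second, close to the reported 205.5 FPS; the small gap is plausibly due to rounding of the reported latency.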

Qualitative Rollout Example

[Figure: TeleopWM qualitative rollout results]

Representative 8-step future RGB rollouts and action alignment across straight, mild-turn, sharp-turn, and intersection scenarios.

Future Action Prediction

[Figure: TeleopWM future action evaluation]

Per-step future action error and correlation for longitudinal and steering predictions.
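
For readers who want to compute comparable numbers on their own rollouts, the sketch below shows one common way to obtain per-step error and correlation: mean absolute error and Pearson correlation, evaluated separately at each future step across samples. The choice of MAE, the function names, and the list-of-lists input layout are assumptions; the exact metrics behind the figure are defined in the TeleopWM repository, not here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # correlation is undefined for a constant signal
    return cov / math.sqrt(var_x * var_y)


def per_step_metrics(preds, targets):
    """Compute MAE and Pearson r for each future step.

    preds, targets: lists of samples, each a list of H per-step values
    (for example, steering at future steps 1..H).
    Returns (mae_per_step, r_per_step), each a list of length H.
    """
    horizon = len(preds[0])
    mae, corr = [], []
    for h in range(horizon):
        p = [sample[h] for sample in preds]
        t = [sample[h] for sample in targets]
        mae.append(sum(abs(a - b) for a, b in zip(p, t)) / len(p))
        corr.append(pearson(p, t))
    return mae, corr
```

The same function can be called once for the longitudinal channel and once for the steering channel to produce two per-step curves.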

Usage

Download the checkpoint and config:

huggingface-cli download bimilab/TeleopWM \
  best.pt config.json \
  --local-dir checkpoints/TeleopWM
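
After downloading, a quick sanity check can confirm that both files are present and that the config parses. The sketch below uses only the Python standard library. The helper name is hypothetical, and the script assumes nothing about the keys inside config.json beyond it being valid JSON.

```python
import json
from pathlib import Path

# File names taken from the download command above.
EXPECTED_FILES = ("best.pt", "config.json")


def check_download(local_dir):
    """Return (missing_files, config) for a download directory.

    Hypothetical helper: `config` is the parsed config.json,
    or None if that file is absent.
    """
    root = Path(local_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    config = None
    config_path = root / "config.json"
    if config_path.is_file():
        with config_path.open("r", encoding="utf-8") as f:
            config = json.load(f)
    return missing, config
```

Calling `check_download("checkpoints/TeleopWM")` should return an empty list of missing files once the download has completed.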

Then evaluate using the TeleopWM repository:

python scripts/evaluate_teleopwm.py \
  --checkpoint checkpoints/TeleopWM/best.pt \
  --data-root /path/to/mile_action_diverse/test/Town05 \
  --split test \
  --sample-strategy uniform \
  --max-samples 64 \
  --device cuda

Runtime benchmarking:

python scripts/benchmark_teleopwm.py \
  --checkpoint checkpoints/TeleopWM/best.pt \
  --device cuda \
  --batch-size 1 \
  --warmup 20 \
  --iters 200
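
The `--warmup` and `--iters` flags suggest the usual warmup-then-measure timing pattern. The sketch below is a generic illustration of that pattern and is not the code of `benchmark_teleopwm.py`: the function name, the `sync` parameter, and the returned fields are assumptions. On a GPU, asynchronous kernels must finish before the timer is read, so a synchronization callable such as `torch.cuda.synchronize` would be passed as `sync`.

```python
import statistics
import time


def benchmark(fn, warmup=20, iters=200, sync=None):
    """Time `fn` with untimed warmup calls followed by `iters` timed calls.

    fn:   zero-argument callable performing one rollout (hypothetical).
    sync: optional zero-argument callable that blocks until device work
          is finished, so timings cover the whole computation.
    """
    # Warmup calls absorb one-time costs (allocation, caching, autotuning).
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    times_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        times_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "iters": iters,
        "mean_ms": statistics.fmean(times_ms),
        "median_ms": statistics.median(times_ms),
    }
```

Reporting the median alongside the mean makes occasional slow iterations, for example from background processes, easier to spot when re-measuring on target hardware.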

Citation

If you use TeleopWM, please cite:

@misc{teleopwm2026,
  title={TeleopWM: A Real-Time Predictive World Model for Latency-Resilient Vision-Based Teleoperation},
  author={Khalil, Aws and Kwon, Jaerock},
  year={2026},
  note={Under review}
}

License

This model is released under the MIT License.
