---
language: en
license: apache-2.0
tags:
- optical-flow
- point-tracking
- computer-vision
- zero-shot
- vit
library_name: megaflow
pipeline_tag: image-to-image
---
# MegaFlow: Zero-Shot Large Displacement Optical Flow
**[Dingxi Zhang](https://kristen-z.github.io/)** · **[Fangjinhua Wang](https://fangjinhuawang.github.io/)** · **[Marc Pollefeys](https://people.inf.ethz.ch/marc.pollefeys/)** · **[Haofei Xu](https://haofeixu.github.io/)**
*ETH Zurich · Microsoft · University of Tübingen, Tübingen AI Center*
[Project Page](https://kristen-z.github.io/projects/megaflow/) ·
[arXiv](https://arxiv.org/abs/) ·
[Code](https://github.com/cvg/megaflow) ·
[Colab](https://colab.research.google.com/github/cvg/megaflow/blob/main/demo_colab.ipynb)
---
**MegaFlow** is a simple, powerful, and unified model for **zero-shot large displacement optical flow** and **point tracking**.
MegaFlow leverages pre-trained Vision Transformer features to naturally capture extreme motion, followed by lightweight iterative refinement for sub-pixel accuracy. It achieves **state-of-the-art zero-shot performance** across major optical flow benchmarks (Sintel, KITTI, Spring) and delivers highly competitive zero-shot generalizability on long-range point tracking benchmarks.
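To make the intuition concrete, here is a toy sketch of global matching, one common way dense features can capture large motion: every pixel in the first frame is compared against every pixel in the second, so the displacement is not limited by a local search window. This is an illustration under stated assumptions, not MegaFlow's actual architecture; the function `global_match_flow`, the softmax temperature, and the random stand-in features are all hypothetical.

```python
import torch
import torch.nn.functional as F


def global_match_flow(feat0, feat1, temperature=0.01):
    """Coarse flow from all-pairs feature similarity (toy global matching).

    feat0, feat1: [C, H, W] feature maps of two frames
    returns:      [2, H, W] flow (x, y) from frame 0 to frame 1
    """
    C, H, W = feat0.shape
    f0 = F.normalize(feat0.reshape(C, -1), dim=0)  # [C, HW], unit-norm columns
    f1 = F.normalize(feat1.reshape(C, -1), dim=0)
    corr = f0.t() @ f1                             # [HW, HW] cosine similarity
    prob = torch.softmax(corr / temperature, dim=1)
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).reshape(-1, 2).float()  # [HW, 2] as (x, y)
    matched = prob @ grid                          # expected match position per pixel
    return (matched - grid).t().reshape(2, H, W)


# Random features stand in for ViT features (assumption for illustration only).
torch.manual_seed(0)
feat0 = torch.randn(64, 8, 8)
# Shift the feature map by 2 rows (y) and 3 columns (x), wrapping at the border.
feat1 = torch.roll(feat0, shifts=(2, 3), dims=(1, 2))
flow = global_match_flow(feat0, feat1)
print(flow[:, 0, 0])  # displacement recovered for the top-left pixel
```

In this toy setting the recovered displacement is only pixel-level; a refinement stage, as described above, is what would supply sub-pixel accuracy.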
## Highlights
- 🏆 State-of-the-art zero-shot performance on Sintel, KITTI, and Spring
- 🎯 Designed for large displacement optical flow
- 📹 Flexible temporal window — processes any number of frames at once
- 🔄 Single backbone for both optical flow and long-range point tracking
## Available Models
| Model ID | Task | Description |
|---|---|---|
| `megaflow-flow` | Optical flow | Full training curriculum (default) |
| `megaflow-chairs-things` | Optical flow | Trained on FlyingThings + FlyingChairs only |
| `megaflow-track` | Point tracking | Fine-tuned on Kubric |
## Quick Start
### Installation
```bash
pip install git+https://github.com/cvg/megaflow.git
```
Requirements: Python ≥ 3.12, PyTorch ≥ 2.7, CUDA recommended.
### Optical Flow
```python
import torch
from megaflow import MegaFlow
device = "cuda" if torch.cuda.is_available() else "cpu"
# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...
model = MegaFlow.from_pretrained("megaflow-flow").eval().to(device)
with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns flow for consecutive pairs: (0→1, 1→2, ...)
        # Shape: [1, T-1, 2, H, W]
        flow = model(video, num_reg_refine=8)["flow_preds"][-1]
```
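The snippet above leaves `video` elided. As a minimal sketch of one way to build a tensor in the documented layout (float32, `[1, T, 3, H, W]`, values in `[0, 255]`), the helper below stacks decoded frames. The name `frames_to_video` is hypothetical, and the RGB channel order is an assumption; the card does not state which order the model expects.

```python
import numpy as np
import torch


def frames_to_video(frames):
    """Stack HxWx3 uint8 frames into a float32 tensor [1, T, 3, H, W] in [0, 255]."""
    array = np.stack(frames, axis=0)                      # [T, H, W, 3]
    video = torch.from_numpy(array).permute(0, 3, 1, 2)   # [T, 3, H, W]
    return video.float().unsqueeze(0)                     # [1, T, 3, H, W]


# Random frames stand in for decoded video frames (assumed RGB order).
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(64, 96, 3), dtype=np.uint8) for _ in range(4)]
video = frames_to_video(frames)
print(video.shape, video.dtype)
```

Values are deliberately left unscaled, since the expected range is `[0, 255]` rather than `[0, 1]`. The tensor would still need to be moved to the same device as the model before inference.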
### Point Tracking
```python
import torch
from megaflow import MegaFlow
from megaflow.utils.basic import gridcloud2d
device = "cuda" if torch.cuda.is_available() else "cpu"
# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...
model = MegaFlow.from_pretrained("megaflow-track").eval().to(device)
with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns dense offsets from frame 0 to each frame t
        flows_e = model.forward_track(video, num_reg_refine=8)["flow_final"]

# Convert offsets to absolute coordinates
H, W = video.shape[-2:]
grid_xy = gridcloud2d(1, H, W, norm=False, device=device).float()
grid_xy = grid_xy.permute(0, 2, 1).reshape(1, 1, 2, H, W)
tracks = flows_e + grid_xy  # [1, T, 2, H, W]
```
```
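Because `tracks` is dense, following a specific point is just an indexing operation. The sketch below reads out trajectories for a few integer pixel positions on frame 0. The helper `trajectories_at` is hypothetical, and it assumes channel 0 holds x and channel 1 holds y, as the name `grid_xy` suggests; a synthetic `tracks` tensor stands in for model output.

```python
import torch


def trajectories_at(tracks, queries_xy):
    """Look up trajectories for integer pixel queries on frame 0.

    tracks:     [1, T, 2, H, W] absolute (x, y) coordinates per frame
    queries_xy: [N, 2] long tensor of (x, y) pixel positions on frame 0
    returns:    [T, N, 2] position of each query point in each frame
    """
    x, y = queries_xy[:, 0], queries_xy[:, 1]
    # Indexing H with y and W with x selects one trajectory per query: [T, 2, N]
    return tracks[0][:, :, y, x].permute(0, 2, 1)


# Synthetic tracks: every point moves by +1 pixel in x and y per frame.
T, H, W = 3, 4, 5
ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
grid = torch.stack([xs, ys]).float()                       # [2, H, W], channel 0 = x
tracks = grid.expand(1, T, 2, H, W) + torch.arange(T).view(1, T, 1, 1, 1)

queries = torch.tensor([[2, 1], [0, 3]])                   # (x, y) on frame 0
traj = trajectories_at(tracks, queries)
print(traj[:, 0])  # path of the point that starts at x=2, y=1
```

For sub-pixel query positions, bilinear sampling of `tracks` would be needed instead of integer indexing.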
## Demo Scripts
```bash
# Clone the repo and run demos
git clone https://github.com/cvg/megaflow.git
cd megaflow
# Optical flow on a video
python demo_flow.py --input assets/longboard.mp4 --output output/longboard_flow.mp4
# Dense point tracking
python demo_track.py --input assets/apple.mp4 --grid_size 8
# Gradio web UI
python demo_gradio.py
```
Or try the [Colab notebook](https://colab.research.google.com/github/cvg/megaflow/blob/main/demo_colab.ipynb) directly in the browser.
## Citation
```bibtex
@article{zhang2026megaflow,
title = {MegaFlow: Zero-Shot Large Displacement Optical Flow},
author = {Zhang, Dingxi and Wang, Fangjinhua and Pollefeys, Marc and Xu, Haofei},
journal = {arXiv preprint arXiv:2603.25739},
year = {2026}
}
```