---
language: en
license: apache-2.0
tags:
- optical-flow
- point-tracking
- computer-vision
- zero-shot
- vit
library_name: megaflow
pipeline_tag: image-to-image
---

# MegaFlow: Zero-Shot Large Displacement Optical Flow

**[Dingxi Zhang](https://kristen-z.github.io/)** · **[Fangjinhua Wang](https://fangjinhuawang.github.io/)** · **[Marc Pollefeys](https://people.inf.ethz.ch/marc.pollefeys/)** · **[Haofei Xu](https://haofeixu.github.io/)**

*ETH Zurich · Microsoft · University of Tübingen, Tübingen AI Center*

[](https://kristen-z.github.io/projects/megaflow/)
[](https://arxiv.org/abs/)
[](https://github.com/cvg/megaflow)
[](https://colab.research.google.com/github/cvg/megaflow/blob/main/demo_colab.ipynb)

---

**MegaFlow** is a simple, powerful, and unified model for **zero-shot large displacement optical flow** and **point tracking**.

MegaFlow leverages pre-trained Vision Transformer features to naturally capture extreme motion, followed by lightweight iterative refinement for sub-pixel accuracy. It achieves **state-of-the-art zero-shot performance** across major optical flow benchmarks (Sintel, KITTI, Spring) and delivers highly competitive zero-shot generalization on long-range point tracking benchmarks.

## Highlights

- 🏆 State-of-the-art zero-shot performance on Sintel, KITTI, and Spring
- 🎯 Designed for large displacement optical flow
- 📹 Flexible temporal window — processes any number of frames at once
- 🔄 Single backbone for both optical flow and long-range point tracking

## Available Models

| Model ID | Task | Description |
|---|---|---|
| `megaflow-flow` | Optical flow | Full training curriculum (default) |
| `megaflow-chairs-things` | Optical flow | Trained on FlyingChairs + FlyingThings only |
| `megaflow-track` | Point tracking | Fine-tuned on Kubric |

## Quick Start

### Installation

```bash
pip install git+https://github.com/cvg/megaflow.git
```

Requirements: Python ≥ 3.12, PyTorch ≥ 2.7, CUDA recommended.
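
Both snippets below expect a video tensor of shape `[1, T, 3, H, W]` with pixel values in `[0, 255]`. One way to assemble such a tensor from already-decoded frames (the frame reader is up to you; `frames_to_tensor` is a hypothetical helper, not part of the library):

```python
import numpy as np
import torch

def frames_to_tensor(frames):
    """Stack a list of HxWx3 uint8 frames into a [1, T, 3, H, W] float32 tensor.

    Pixel values are kept in [0, 255], as the model expects.
    """
    arr = np.stack(frames)                         # [T, H, W, 3]
    video = torch.from_numpy(arr).float()          # float32, still in [0, 255]
    return video.permute(0, 3, 1, 2).unsqueeze(0)  # [1, T, 3, H, W]

# Example with dummy frames in place of decoded video frames
frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(4)]
video = frames_to_tensor(frames)
print(video.shape)  # torch.Size([1, 4, 3, 480, 640])
```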

### Optical Flow

```python
import torch
from megaflow import MegaFlow

device = "cuda" if torch.cuda.is_available() else "cpu"

# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...

model = MegaFlow.from_pretrained("megaflow-flow").eval().to(device)

with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns flow for consecutive pairs: (0→1, 1→2, ...)
        # Shape: [1, T-1, 2, H, W]
        flow = model(video, num_reg_refine=8)["flow_preds"][-1]
```
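
The returned flow field can be post-processed however you like. For a quick sanity check or visualization, the per-pixel displacement magnitude is just the Euclidean norm over the 2-channel axis. A minimal sketch, using a dummy tensor with the documented `[1, T-1, 2, H, W]` shape in place of a real prediction:

```python
import torch

# Dummy flow stand-in with the documented shape [1, T-1, 2, H, W]
flow = torch.zeros(1, 3, 2, 64, 64)
flow[0, 0, 0] = 3.0  # dx = 3 everywhere in the first pair
flow[0, 0, 1] = 4.0  # dy = 4 everywhere in the first pair

# Per-pixel displacement magnitude: [1, T-1, H, W]
magnitude = torch.linalg.norm(flow, dim=2)
print(magnitude[0, 0].max().item())  # 5.0 (a 3-4-5 triangle)
```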

### Point Tracking

```python
import torch
from megaflow import MegaFlow
from megaflow.utils.basic import gridcloud2d

device = "cuda" if torch.cuda.is_available() else "cpu"

# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...
H, W = video.shape[-2:]

model = MegaFlow.from_pretrained("megaflow-track").eval().to(device)

with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns dense offsets from frame 0 to each frame t
        flows_e = model.forward_track(video, num_reg_refine=8)["flow_final"]

# Convert offsets to absolute coordinates
grid_xy = gridcloud2d(1, H, W, norm=False, device=device).float()
grid_xy = grid_xy.permute(0, 2, 1).reshape(1, 1, 2, H, W)
tracks = flows_e + grid_xy  # [1, T, 2, H, W]
```
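
Because `tracks` is dense, the trajectory of any individual query pixel can be read out by plain indexing: the pixel at `(x, y)` in frame 0 traces `tracks[0, :, :, y, x]`, a `[T, 2]` sequence of (x, y) positions. A sketch with a dummy tensor standing in for the model output:

```python
import torch

T, H, W = 5, 64, 64
# Dummy dense tracks [1, T, 2, H, W]; real values come from the snippet above.
# Here every pixel "stays put": channel 0 holds x, channel 1 holds y.
tracks = torch.zeros(1, T, 2, H, W)
tracks[0, :, 0] = torch.arange(W).view(1, 1, W).float()  # x coordinate
tracks[0, :, 1] = torch.arange(H).view(1, H, 1).float()  # y coordinate

x, y = 10, 20
trajectory = tracks[0, :, :, y, x]  # [T, 2] (x, y) positions across frames
print(trajectory[0].tolist())  # [10.0, 20.0]
```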

## Demo Scripts

```bash
# Clone the repo and run demos
git clone https://github.com/cvg/megaflow.git
cd megaflow

# Optical flow on a video
python demo_flow.py --input assets/longboard.mp4 --output output/longboard_flow.mp4

# Dense point tracking
python demo_track.py --input assets/apple.mp4 --grid_size 8

# Gradio web UI
python demo_gradio.py
```

Or try the [Colab notebook](https://colab.research.google.com/github/cvg/megaflow/blob/main/demo_colab.ipynb) directly in the browser.

## Citation

```bibtex
@article{zhang2026megaflow,
  title   = {MegaFlow: Zero-Shot Large Displacement Optical Flow},
  author  = {Zhang, Dingxi and Wang, Fangjinhua and Pollefeys, Marc and Xu, Haofei},
  journal = {arXiv preprint arXiv:2603.25739},
  year    = {2026}
}
```