---
language: en
license: apache-2.0
tags:
- optical-flow
- point-tracking
- computer-vision
- zero-shot
- vit
library_name: megaflow
pipeline_tag: image-to-image
---

# MegaFlow: Zero-Shot Large Displacement Optical Flow

**[Dingxi Zhang](https://kristen-z.github.io/)** · **[Fangjinhua Wang](https://fangjinhuawang.github.io/)** · **[Marc Pollefeys](https://people.inf.ethz.ch/marc.pollefeys/)** · **[Haofei Xu](https://haofeixu.github.io/)**

*ETH Zurich · Microsoft · University of Tübingen, Tübingen AI Center*

[Project Page](https://kristen-z.github.io/projects/megaflow/) · [arXiv](https://arxiv.org/abs/) · [Code](https://github.com/cvg/megaflow) · [Colab](https://colab.research.google.com/github/cvg/megaflow/blob/main/demo_colab.ipynb)

---
**MegaFlow** is a simple, powerful, and unified model for **zero-shot large displacement optical flow** and **point tracking**.

MegaFlow leverages pre-trained Vision Transformer features to naturally capture extreme motion, followed by lightweight iterative refinement for sub-pixel accuracy. It achieves **state-of-the-art zero-shot performance** across major optical flow benchmarks (Sintel, KITTI, Spring) and delivers highly competitive zero-shot generalization on long-range point tracking benchmarks.

## Highlights

- 🏆 State-of-the-art zero-shot performance on Sintel, KITTI, and Spring
- 🎯 Designed for large displacement optical flow
- 📹 Flexible temporal window — processes any number of frames at once
- 🔄 Single backbone for both optical flow and long-range point tracking

## Available Models

| Model ID | Task | Description |
|---|---|---|
| `megaflow-flow` | Optical flow | Full training curriculum (default) |
| `megaflow-chairs-things` | Optical flow | Trained on FlyingChairs + FlyingThings only |
| `megaflow-track` | Point tracking | Fine-tuned on Kubric |

## Quick Start

### Installation

```bash
pip install git+https://github.com/cvg/megaflow.git
```

Requirements: Python ≥ 3.12, PyTorch ≥ 2.7, CUDA recommended.

### Optical Flow

```python
import torch
from megaflow import MegaFlow

device = "cuda" if torch.cuda.is_available() else "cpu"

# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...

model = MegaFlow.from_pretrained("megaflow-flow").eval().to(device)

with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns flow for consecutive pairs: (0→1, 1→2, ...)
        # Shape: [1, T-1, 2, H, W]
        flow = model(video, num_reg_refine=8)["flow_preds"][-1]
```

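The snippet above leaves `video = ...` as a placeholder. A minimal sketch of building the expected tensor, assuming frames are already decoded as `H×W×3` uint8 arrays (e.g. with OpenCV or imageio); `frames_to_video_tensor` is an illustrative helper, not part of the `megaflow` API:

```python
import numpy as np
import torch

def frames_to_video_tensor(frames):
    """Stack H x W x 3 uint8 frames into a [1, T, 3, H, W] float32 tensor in [0, 255]."""
    arr = np.stack(frames)                 # [T, H, W, 3], uint8
    video = torch.from_numpy(arr).float()  # keep the 0-255 range; no /255 normalization
    video = video.permute(0, 3, 1, 2)      # [T, 3, H, W]
    return video.unsqueeze(0)              # [1, T, 3, H, W]

frames = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8) for _ in range(4)]
video = frames_to_video_tensor(frames)
print(video.shape)  # torch.Size([1, 4, 3, 480, 640])
```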
### Point Tracking

```python
import torch
from megaflow import MegaFlow
from megaflow.utils.basic import gridcloud2d

device = "cuda" if torch.cuda.is_available() else "cpu"

# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...
_, T, _, H, W = video.shape

model = MegaFlow.from_pretrained("megaflow-track").eval().to(device)

with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns dense offsets from frame 0 to each frame t
        flows_e = model.forward_track(video, num_reg_refine=8)["flow_final"]

# Convert offsets to absolute coordinates
grid_xy = gridcloud2d(1, H, W, norm=False, device=device).float()
grid_xy = grid_xy.permute(0, 2, 1).reshape(1, 1, 2, H, W)
tracks = flows_e + grid_xy  # [1, T, 2, H, W]
```
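The dense `tracks` tensor holds one trajectory per pixel. To read out the trajectories of specific query pixels from frame 0, plain advanced indexing is enough. A sketch, using a dummy `tracks` tensor so it runs standalone (in practice, use the `tracks` computed above):

```python
import torch

# Dummy dense tracks for illustration; shape [1, T, 2, H, W] as in the snippet above
T, H, W = 5, 64, 96
tracks = torch.rand(1, T, 2, H, W) * 100

# Query pixels in frame 0, given as integer (x, y) coordinates
queries = torch.tensor([[10, 20], [50, 30]])  # [N, 2]

# Advanced indexing over the last two dims gives [1, T, 2, N]
traj = tracks[:, :, :, queries[:, 1], queries[:, 0]]
traj = traj.permute(0, 1, 3, 2)  # [1, T, N, 2]: (x, y) of each query in every frame
print(traj.shape)  # torch.Size([1, 5, 2, 2])
```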

## Demo Scripts

```bash
# Clone the repo and run demos
git clone https://github.com/cvg/megaflow.git
cd megaflow

# Optical flow on a video
python demo_flow.py --input assets/longboard.mp4 --output output/longboard_flow.mp4

# Dense point tracking
python demo_track.py --input assets/apple.mp4 --grid_size 8

# Gradio web UI
python demo_gradio.py
```

Or try the [Colab notebook](https://colab.research.google.com/github/cvg/megaflow/blob/main/demo_colab.ipynb) directly in the browser.

## Citation

```bibtex
@article{zhang2026megaflow,
  title   = {MegaFlow: Zero-Shot Large Displacement Optical Flow},
  author  = {Zhang, Dingxi and Wang, Fangjinhua and Pollefeys, Marc and Xu, Haofei},
  journal = {arXiv preprint arXiv:2603.25739},
  year    = {2026}
}
```