Improve model card for AllTracker: add metadata, project page, and usage

#2 opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +52 -2
README.md CHANGED
@@ -2,8 +2,58 @@
  license: mit
  tags:
  - tracking
  ---

- This repository contains the model described in [AllTracker: Efficient Dense Point Tracking at High Resolution](https://huggingface.co/papers/2506.07310).

- Code: https://github.com/aharley/alltracker
  license: mit
  tags:
  - tracking
+ pipeline_tag: image-to-image
  ---

+ # AllTracker: Efficient Dense Point Tracking at High Resolution
+
+ [[Paper](https://huggingface.co/papers/2506.07310)] | [[Project Page](https://alltracker.github.io/)] | [[Code](https://github.com/aharley/alltracker)] | [[Gradio Demo](https://huggingface.co/spaces/aharley/alltracker)]
+
+ <img src='https://alltracker.github.io/images/monkey.jpg'>
+
+ **AllTracker is a point tracking model that is faster and more accurate than comparable models, while also producing dense output at high resolution.**
+
+ AllTracker estimates long-range point tracks by estimating the flow field between a query frame and every other frame of a video. Unlike existing point tracking methods, our approach delivers high-resolution, dense (all-pixel) correspondence fields, which can be visualized as flow maps. Unlike existing optical flow methods, our approach corresponds one frame to hundreds of subsequent frames, rather than just the next frame. We develop a new architecture for this task, blending techniques from existing work in optical flow and point tracking: the model performs iterative inference on low-resolution grids of correspondence estimates, propagating information spatially via 2D convolution layers and temporally via pixel-aligned attention layers. The model is fast and parameter-efficient (16 million parameters), and delivers state-of-the-art point tracking accuracy at high resolution (i.e., tracking 768x1024 pixels on a 40G GPU). A benefit of our design is that we can train jointly on optical flow datasets and point tracking datasets, and we find that doing so is crucial for top performance.
+
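Since the model's output for a query frame is a flow field to every other frame, dense tracks can be read off by adding each pixel's flow vector to its query-frame coordinates. A minimal NumPy sketch of this relationship (shapes, names, and values are illustrative only, not the model's actual API):

```python
import numpy as np

# Hypothetical output: flows[t] maps query-frame pixels to frame t,
# shape (T, H, W, 2) holding (dx, dy) per pixel -- dense, all-pixel output.
T, H, W = 4, 3, 5
flows = np.zeros((T, H, W, 2), dtype=np.float32)
flows[..., 0] = np.arange(T, dtype=np.float32)[:, None, None]  # dx grows over time

# Query coordinates: one (x, y) per pixel of the query frame.
ys, xs = np.mgrid[0:H, 0:W]
coords = np.stack([xs, ys], axis=-1).astype(np.float32)  # (H, W, 2)

# Dense tracks: position of every query pixel in every frame.
tracks = coords[None] + flows  # (T, H, W, 2)
print(tracks[3, 0, 0])  # pixel (0, 0) after accumulating dx=3 -> [3. 0.]
```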
+ ## Usage (Running the Demo)
+
+ First, clone the repository and set up a fresh conda environment for AllTracker:
+
+ ```bash
+ git clone https://github.com/aharley/alltracker.git
+ cd alltracker
+ conda create -n alltracker python=3.12.8
+ conda activate alltracker
+ pip install -r requirements.txt
+ ```
+
+ Download the sample video:
+ ```bash
+ cd demo_video
+ sh download_video.sh
+ cd ..
+ ```
+
+ Run the demo:
+ ```bash
+ python demo.py --mp4_path ./demo_video/monkey.mp4
+ ```
+ The demo script will automatically download the model weights from [Hugging Face](https://huggingface.co/aharley/alltracker/tree/main) if needed.
+
+ For a fancier visualization, giving a side-by-side view of the input and output, try this:
+ ```bash
+ python demo.py --mp4_path ./demo_video/monkey.mp4 --query_frame 16 --conf_thr 0.01 --bkg_opacity 0.0 --rate 2 --hstack
+ ```
+
+ For more detailed information on training and evaluation, please refer to the [official GitHub repository](https://github.com/aharley/alltracker).
+
+ ## Citation
+
+ If you use this code for your research, please cite:
+
+ ```bibtex
+ @inproceedings{harley2025alltracker,
+   author = {Adam W. Harley and Yang You and Xinglong Sun and Yang Zheng and Nikhil Raghuraman and Yunqi Gu and Sheldon Liang and Wen-Hsuan Chu and Achal Dave and Pavel Tokmakov and Suya You and Rares Ambrus and Katerina Fragkiadaki and Leonidas J. Guibas},
+   title = {All{T}racker: {E}fficient Dense Point Tracking at High Resolution},
+   booktitle = {ICCV},
+   year = {2025}
+ }
+ ```