Improve model card for AllTracker: add metadata, project page, and usage
This PR enhances the model card for the AllTracker model by:
- Adding the `pipeline_tag: image-to-image` for better discoverability and categorization on the Hub.
- Including the official project page link for comprehensive context.
- Expanding the model description based on the paper abstract and GitHub README to provide a richer overview.
- Adding clear usage instructions to enable users to quickly run the demo.
- Including the Gradio demo link and BibTeX citation for full information and proper attribution.
**README.md** (changed):
---
license: mit
tags:
- tracking
pipeline_tag: image-to-image
---

# AllTracker: Efficient Dense Point Tracking at High Resolution

[[Paper](https://huggingface.co/papers/2506.07310)] | [[Project Page](https://alltracker.github.io/)] | [[Code](https://github.com/aharley/alltracker)] | [[Gradio Demo](https://huggingface.co/spaces/aharley/alltracker)]

<img src='https://alltracker.github.io/images/monkey.jpg'>

**AllTracker is a point tracking model that is faster and more accurate than comparable methods, while also producing dense output at high resolution.**

AllTracker estimates long-range point tracks by estimating the flow field between a query frame and every other frame of a video. Unlike existing point tracking methods, our approach delivers high-resolution and dense (all-pixel) correspondence fields, which can be visualized as flow maps. Unlike existing optical flow methods, our approach corresponds one frame to hundreds of subsequent frames, rather than just the next frame.

We develop a new architecture for this task, blending techniques from existing work in optical flow and point tracking: the model performs iterative inference on low-resolution grids of correspondence estimates, propagating information spatially via 2D convolution layers, and propagating information temporally via pixel-aligned attention layers. The model is fast and parameter-efficient (16 million parameters), and delivers state-of-the-art point tracking accuracy at high resolution (tracking 768×1024 pixels on a 40 GB GPU). A benefit of our design is that we can train jointly on optical flow datasets and point tracking datasets, and we find that doing so is crucial for top performance.
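To get a feel for why dense, high-resolution tracking is demanding, here is some illustrative back-of-the-envelope arithmetic. The tensor layout below (an `(x, y, visibility)` triple per pixel per frame, stored as float32) is our assumption for illustration, not a figure from the paper:

```python
# Back-of-the-envelope size of a dense track output at 768x1024.
# Illustrative only: assumes (x, y, visibility) as float32 per pixel per frame;
# AllTracker's actual tensor layout may differ.

def track_output_bytes(height, width, num_frames,
                       values_per_point=3, bytes_per_value=4):
    """Bytes needed to store dense tracks for every pixel over all frames."""
    return height * width * num_frames * values_per_point * bytes_per_value

gb = track_output_bytes(768, 1024, num_frames=100) / 1e9
print(f"Dense tracks, 100 frames at 768x1024: ~{gb:.1f} GB")  # ~0.9 GB

# The model itself is small: 16M float32 parameters is only ~64 MB.
print(f"Weights: ~{16_000_000 * 4 / 1e6:.0f} MB")
```

The gap between the small weight footprint and the large output is one reason iterating on low-resolution grids of estimates matters at this resolution.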

## Usage (Running the Demo)

First, clone the [repository](https://github.com/aharley/alltracker) and set up a fresh conda environment for AllTracker:

```bash
git clone https://github.com/aharley/alltracker.git
cd alltracker
conda create -n alltracker python=3.12.8
conda activate alltracker
pip install -r requirements.txt
```

Download the sample video:

```bash
cd demo_video
sh download_video.sh
cd ..
```

Run the demo:

```bash
python demo.py --mp4_path ./demo_video/monkey.mp4
```

The demo script will automatically download the model weights from [Hugging Face](https://huggingface.co/aharley/alltracker/tree/main) if needed.
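If you prefer to pre-fetch the checkpoint yourself (e.g., for offline runs), a minimal sketch using `huggingface_hub` — the `fetch_weights` helper name is ours, and the demo's own download logic remains authoritative:

```python
# Pre-fetch the AllTracker weights from the Hub so later runs work offline.
# fetch_weights is a hypothetical helper, not part of the AllTracker codebase.
from huggingface_hub import snapshot_download

def fetch_weights(repo_id: str = "aharley/alltracker") -> str:
    """Download the full model repository and return its local cache path."""
    return snapshot_download(repo_id=repo_id)

# local_dir = fetch_weights()  # then point the demo at the cached files
```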

For a fancier visualization, giving a side-by-side view of the input and output, try this:

```bash
python demo.py --mp4_path ./demo_video/monkey.mp4 --query_frame 16 --conf_thr 0.01 --bkg_opacity 0.0 --rate 2 --hstack
```

For more detailed information on training and evaluation, please refer to the [official GitHub repository](https://github.com/aharley/alltracker).

## Citation

If you use this code for your research, please cite:

```bibtex
@inproceedings{harley2025alltracker,
  author    = {Adam W. Harley and Yang You and Xinglong Sun and Yang Zheng and Nikhil Raghuraman and Yunqi Gu and Sheldon Liang and Wen-Hsuan Chu and Achal Dave and Pavel Tokmakov and Suya You and Rares Ambrus and Katerina Fragkiadaki and Leonidas J. Guibas},
  title     = {All{T}racker: {E}fficient Dense Point Tracking at High Resolution},
  booktitle = {ICCV},
  year      = {2025}
}
```
|