---
license: apache-2.0
tags:
- vision
- tracking
---

# TAPNet
This repository contains checkpoints for several point tracking models developed by DeepMind.

**Code**: [https://github.com/google-deepmind/tapnet](https://github.com/google-deepmind/tapnet)
## Included Models
[**TAPIR**](https://deepmind-tapir.github.io/) – A fast and accurate point tracker for continuous point trajectories in space-time, presented in the paper [**TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement**](https://huggingface.co/papers/2306.08637).
[**BootsTAPIR**](https://bootstap.github.io/) – A bootstrapped variant of TAPIR that improves robustness and stability across long videos via self-supervised refinement, presented in the paper [**BootsTAP: Bootstrapped Training for Tracking-Any-Point**](https://huggingface.co/papers/2402.00847).
[**TAPNext**](https://tap-next.github.io/) – A generative approach that frames point tracking as next-token prediction, enabling semi-dense, accurate, and temporally coherent tracking across challenging videos, presented in the paper [**TAPNext: Tracking Any Point (TAP) as Next Token Prediction**](https://huggingface.co/papers/2504.05579).
These models provide state-of-the-art performance for tracking arbitrary points in videos and support research and applications in robotics, perception, and video generation.