Add pipeline tag and links to paper/code
#1
by nielsr HF Staff - opened
README.md (CHANGED)
````diff
@@ -2,15 +2,20 @@
 license: other
 license_name: nvidia
 license_link: LICENSE
+pipeline_tag: video-classification
 ---
 
+# VideoMAE_AutoGaze
 
+[**Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing**](https://huggingface.co/papers/2603.12254)
+
+[**Project Page**](https://autogaze.github.io/) | [**GitHub**](https://github.com/NVlabs/AutoGaze) | [**Demo**](https://huggingface.co/spaces/bfshi/AutoGaze)
 
 ## Model Overview
 
 ### Description:
 
-VideoMAE model used for training AutoGaze. This model is for research and development only.
+VideoMAE model used for training AutoGaze. This model is for research and development only. <br>
 
 ### License/Terms of Use:
 
@@ -124,3 +129,17 @@ The raw videos are collected from public dataset including Ego4D, 100DoH, Intern
 NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/). <br>
 
 Please make sure you have proper rights and permissions for all input image and video content; if image or video includes people, personal health information, or intellectual property, the image or video generated will not blur or maintain proportions of image subjects included. <br>
+
+## Citation
+
+```bibtex
+@misc{shi2026attendattentionefficientscalable,
+      title={Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing},
+      author={Baifeng Shi and Stephanie Fu and Long Lian and Hanrong Ye and David Eigen and Aaron Reite and Boyi Li and Jan Kautz and Song Han and David M. Chan and Pavlo Molchanov and Trevor Darrell and Hongxu Yin},
+      year={2026},
+      eprint={2603.12254},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2603.12254},
+}
+```
````