nielsr (HF Staff) committed
Commit dd8e396 · verified · 1 Parent(s): e814b41

Improve model card: Add tags, paper/project links, and installation instructions


This PR improves the model card by:

- Adding `pipeline_tag: robotics` for better discoverability.
- Specifying `library_name: lerobot`, since the model is built with that library.
- Adding `license: apache-2.0`.
- Populating the content with a project overview, paper link, project page link, and GitHub repository link.
- Including basic installation instructions.

Files changed (1)

README.md +68 -4
README.md CHANGED
@@ -2,9 +2,73 @@
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Code: [More Information Needed]
- - Paper: [More Information Needed]
- - Docs: [More Information Needed]
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
+ pipeline_tag: robotics
+ library_name: lerobot
+ license: apache-2.0
  ---

+ # Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers
+
+ This repository contains a pretrained model, part of the **Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers** project.
+
+ **Paper:** [Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers](https://huggingface.co/papers/2507.15833)
+ **Project Website:** [https://ian-chuang.github.io/gaze-av-aloha/](https://ian-chuang.github.io/gaze-av-aloha/)
+ **Code:** [https://github.com/ian-chuang/gaze-av-aloha](https://github.com/ian-chuang/gaze-av-aloha)
+
+ ## Overview
+
+ This model accompanies the official code for the paper:
+ **"Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers"**
+
+ We propose a human-inspired *foveated vision framework* for robot learning that combines human gaze, foveated ViTs, and robotic control to enable policies that are both efficient and robust. Our approach reduces ViT computation by 94%, accelerating training by 7× and inference by 3×.
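+
+ Purely as an illustration of where those savings come from (this is not the project's implementation, and the window size, patch size, downsampling factor, and image resolution below are placeholder values), a two-resolution gaze-centered scheme can be sketched like this:
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ def foveate(image, gaze_xy, fovea=96, patch=16, periphery_scale=4):
+     """Keep a full-resolution window around the gaze point plus a downsampled
+     periphery, so a ViT sees far fewer tokens than a dense full-resolution grid."""
+     _, H, W = image.shape
+     x, y = gaze_xy
+     # Clamp the foveal window so it stays inside the image.
+     left = max(0, min(W - fovea, x - fovea // 2))
+     top = max(0, min(H - fovea, y - fovea // 2))
+     fovea_crop = image[:, top:top + fovea, left:left + fovea]
+     periphery = F.interpolate(image.unsqueeze(0), scale_factor=1 / periphery_scale,
+                               mode="bilinear", align_corners=False)[0]
+     # Token counts for a dense grid vs. the two foveated inputs.
+     n_full = (H // patch) * (W // patch)
+     n_fov = (fovea // patch) ** 2 + (periphery.shape[1] // patch) * (periphery.shape[2] // patch)
+     return fovea_crop, periphery, n_full, n_fov
+
+ img = torch.rand(3, 480, 640)  # dummy camera frame
+ crop, ctx, n_full, n_fov = foveate(img, gaze_xy=(400, 250))
+ print(f"{n_full} tokens full-res vs {n_fov} foveated ({100 * (1 - n_fov / n_full):.0f}% fewer)")
+ ```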
+
+ ## Installation
+
+ Follow the steps below to set up the environment and install all necessary dependencies.
+
+ ```bash
+ # Clone the repository and initialize submodules
+ git clone https://github.com/ian-chuang/gaze-av-aloha.git
+ cd gaze-av-aloha
+ git submodule init
+ git submodule update
+
+ # Create and activate a new Conda environment
+ conda create -n gaze python=3.10
+ conda activate gaze
+
+ # Install LeRobot
+ pip install git+https://github.com/huggingface/lerobot.git@483be9aac217c2d8ef16982490f22b2ad091ab46
+
+ # Install FFmpeg for video logging
+ conda install ffmpeg=7.1.1 -c conda-forge
+
+ # Install AV-ALOHA packages
+ pip install -e ./gym_av_aloha
+ pip install -e ./gaze_av_aloha
+ ```
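+
+ As a quick sanity check that the editable installs above are importable (the module names here are assumed to match the `pip install -e` directories; adjust them if the packages expose different names):
+
+ ```python
+ # Module names are assumed to match the editable-install directories above.
+ import lerobot          # installed from the pinned git commit
+ import gym_av_aloha     # from ./gym_av_aloha
+ import gaze_av_aloha    # from ./gaze_av_aloha
+
+ print("imports OK")
+ ```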
+
+ ### Authentication
+
+ Make sure you're logged in to both Weights & Biases and Hugging Face:
+
+ ```bash
+ wandb login
+ huggingface-cli login
+ ```
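+
+ The original card for this checkpoint notes that it was pushed with the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration, so it can be pulled back with `from_pretrained`. The policy class and repo id below are placeholders rather than the project's documented API; only the `.from_pretrained()` pattern comes from the mixin itself:
+
+ ```python
+ # Placeholder names: swap in the policy class actually exported by
+ # gaze_av_aloha and the repo id of this model.
+ from gaze_av_aloha.policies import GazePolicy  # hypothetical import path
+
+ policy = GazePolicy.from_pretrained("<user>/<this-model-repo>")  # PyTorchModelHubMixin API
+ policy.eval()
+ print(sum(p.numel() for p in policy.parameters()), "parameters")
+ ```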
+
+ ## Citation
+
+ If you find our work helpful or inspiring, please feel free to cite it:
+
+ ```bibtex
+ @misc{chuang2025lookfocusactefficient,
+   title={Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers},
+   author={Ian Chuang and Andrew Lee and Dechen Gao and Jinyu Zou and Iman Soltani},
+   year={2025},
+   eprint={2507.15833},
+   archivePrefix={arXiv},
+   primaryClass={cs.RO},
+   url={https://arxiv.org/abs/2507.15833},
+ }
+ ```