nielsr (HF Staff) committed
Commit dd8e396 · verified · 1 Parent(s): e814b41

Improve model card: Add tags, paper/project links, and installation instructions


This PR improves the model card by:

- Adding `pipeline_tag: robotics` for better discoverability.
- Specifying `library_name: lerobot`, since the model is built with that library.
- Adding `license: apache-2.0`.
- Populating the content with a project overview, paper link, project page link, and GitHub repository link.
- Including basic installation instructions.

Files changed (1)

README.md +68 -4
README.md CHANGED
@@ -2,9 +2,73 @@
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Code: [More Information Needed]
- - Paper: [More Information Needed]
- - Docs: [More Information Needed]
  tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
+ pipeline_tag: robotics
+ library_name: lerobot
+ license: apache-2.0
  ---

+ # Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers
+
+ This repository contains a pretrained model, part of the **Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers** project.
+
+ **Paper:** [Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers](https://huggingface.co/papers/2507.15833)
+ **Project Website:** [https://ian-chuang.github.io/gaze-av-aloha/](https://ian-chuang.github.io/gaze-av-aloha/)
+ **Code:** [https://github.com/ian-chuang/gaze-av-aloha](https://github.com/ian-chuang/gaze-av-aloha)
+
+ ## Overview
+
+ This model accompanies the official code for the paper:
+ **"Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers"**
+
+ We propose a human-inspired *foveated vision framework* for robot learning that combines human gaze, foveated ViTs, and robotic control to enable policies that are both efficient and robust. Our approach reduces ViT computation by 94%, accelerating training by 7× and inference by 3×.
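+
+ Purely as an illustration of where those savings come from (this is not the project's implementation, and the window size, patch size, downsampling factor, and image resolution below are placeholder values), a two-resolution gaze-centered scheme can be sketched like this:
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ def foveate(image, gaze_xy, fovea=96, patch=16, periphery_scale=4):
+     """Keep a full-resolution window around the gaze point plus a downsampled
+     periphery, so a ViT sees far fewer tokens than a dense full-resolution grid."""
+     _, H, W = image.shape
+     x, y = gaze_xy
+     # Clamp the foveal window so it stays inside the image.
+     left = max(0, min(W - fovea, x - fovea // 2))
+     top = max(0, min(H - fovea, y - fovea // 2))
+     fovea_crop = image[:, top:top + fovea, left:left + fovea]
+     periphery = F.interpolate(image.unsqueeze(0), scale_factor=1 / periphery_scale,
+                               mode="bilinear", align_corners=False)[0]
+     # Token counts for a dense grid vs. the two foveated inputs.
+     n_full = (H // patch) * (W // patch)
+     n_fov = (fovea // patch) ** 2 + (periphery.shape[1] // patch) * (periphery.shape[2] // patch)
+     return fovea_crop, periphery, n_full, n_fov
+
+ img = torch.rand(3, 480, 640)  # dummy camera frame
+ crop, ctx, n_full, n_fov = foveate(img, gaze_xy=(400, 250))
+ print(f"{n_full} tokens full-res vs {n_fov} foveated ({100 * (1 - n_fov / n_full):.0f}% fewer)")
+ ```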
+
+ ## Installation
+
+ Follow the steps below to set up the environment and install all necessary dependencies.
+
+ ```bash
+ # Clone the repository and initialize submodules
+ git clone https://github.com/ian-chuang/gaze-av-aloha.git
+ cd gaze-av-aloha
+ git submodule init
+ git submodule update
+
+ # Create and activate a new Conda environment
+ conda create -n gaze python=3.10
+ conda activate gaze
+
+ # Install LeRobot
+ pip install git+https://github.com/huggingface/lerobot.git@483be9aac217c2d8ef16982490f22b2ad091ab46
+
+ # Install FFmpeg for video logging
+ conda install ffmpeg=7.1.1 -c conda-forge
+
+ # Install AV-ALOHA packages
+ pip install -e ./gym_av_aloha
+ pip install -e ./gaze_av_aloha
+ ```
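+
+ As a quick sanity check that the editable installs above are importable (the module names here are assumed to match the `pip install -e` directories; adjust them if the packages expose different names):
+
+ ```python
+ # Module names are assumed to match the editable-install directories above.
+ import lerobot          # installed from the pinned git commit
+ import gym_av_aloha     # from ./gym_av_aloha
+ import gaze_av_aloha    # from ./gaze_av_aloha
+
+ print("imports OK")
+ ```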
+
+ ### Authentication
+
+ Make sure you're logged in to both Weights & Biases and Hugging Face:
+
+ ```bash
+ wandb login
+ huggingface-cli login
+ ```
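+
+ The original card for this checkpoint notes that it was pushed with the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration, so it can be pulled back with `from_pretrained`. The policy class and repo id below are placeholders rather than the project's documented API; only the `.from_pretrained()` pattern comes from the mixin itself:
+
+ ```python
+ # Placeholder names: swap in the policy class actually exported by
+ # gaze_av_aloha and the repo id of this model.
+ from gaze_av_aloha.policies import GazePolicy  # hypothetical import path
+
+ policy = GazePolicy.from_pretrained("<user>/<this-model-repo>")  # PyTorchModelHubMixin API
+ policy.eval()
+ print(sum(p.numel() for p in policy.parameters()), "parameters")
+ ```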
+
+ ## Citation
+
+ If you find our work helpful or inspiring, please feel free to cite it:
+
+ ```bibtex
+ @misc{chuang2025lookfocusactefficient,
+   title={Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers},
+   author={Ian Chuang and Andrew Lee and Dechen Gao and Jinyu Zou and Iman Soltani},
+   year={2025},
+   eprint={2507.15833},
+   archivePrefix={arXiv},
+   primaryClass={cs.RO},
+   url={https://arxiv.org/abs/2507.15833},
+ }
+ ```