chaubeyG committed · Commit dd1661d · verified · Parent(s): b83ca73

Update README.md

Files changed (1): README.md (+48 −10)

README.md CHANGED
@@ -9,6 +9,8 @@ base_model:


 [![arXiv](https://img.shields.io/badge/arXiv-2504.07198-b31b1b.svg)](https://arxiv.org/abs/2504.07198)
 [![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/)

 These are the officially released weights of the **WACV 2026 Round 1** Early Accept paper (6.4% acceptance rate) - Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning. Please refer to the [official GitHub repository](https://github.com/ihp-lab/face-llava) for instructions to run inference.
@@ -21,6 +23,11 @@ The human face plays a central role in social communication, necessitating the u

 ---

 ## 📦 Repository Structure

 ```bash
@@ -78,26 +85,57 @@ The human face plays a central role in social communication, necessitating the u

 ## 🎯 Inference

- 1. Download the model weights from [here (Use USC Email)](https://drive.google.com/file/d/1TAZE70WlqY1rQJIzdJ9x7P7IopyYSlfk/view?usp=sharing) and unzip them inside a `checkpoints/` folder so that the structure becomes - `./checkpoints/facellava-7b-wolm`.

- 2. ***Make sure that the input video or image is already face-cropped as the current version does not support automatic cropping.***

 3. Run the following command for inference.

 ```bash
- CUDA_VISIBLE_DEVICES=0 python inference.py --model_path="./checkpoints/facellava-7b-wolm" \
 --file_path="./assets/demo_inputs/face_attr_example_1.png" --prompt="What are the facial attributes in the given image?"
 ```

- 4. Currently the following face perception tasks are supported along with the best modality suited for that task - Emotion(Video), Age(Image), Facial Attributes(Image), Facial Action Units(Image)

 5. A list of prompts that work well for different tasks is present in `./assets/good_prompts`.

 ### ✅ Repository Progress

- - [ ] Training Script
- - [ ] Evaluation Metrics
- - [ ] Dataset Release & Preprocessing Code
- - [ ] Inference Code (with Landmarks & Auto Face Cropping)
- - [x] Inference Code (Basic)
- - [x] Model Weights (w/o Landmarks)


 [![arXiv](https://img.shields.io/badge/arXiv-2504.07198-b31b1b.svg)](https://arxiv.org/abs/2504.07198)
+ [![Model Weights](https://img.shields.io/badge/%F0%9F%A4%97%20Weights-FaceLLaVA-orange)](https://huggingface.co/chaubeyG/FaceLLaVA)
+ [![License](https://img.shields.io/badge/license-USC%20Research-green)](LICENSE.rst)
 [![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/)

 These are the officially released weights of the **WACV 2026 Round 1** Early Accept paper (6.4% acceptance rate) - Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning. Please refer to the [official GitHub repository](https://github.com/ihp-lab/face-llava) for instructions to run inference.
 
 ---

+ ## 📣 News
+
+ - [Oct. 2025] Initial release of the official codebase and model weights. Stay tuned for more details and the dataset.
+ - [Sept. 2025] FaceLLaVA accepted in the first round of WACV 2026 (6.4% acceptance rate). See you in Tucson!
+
 ## 📦 Repository Structure

 ```bash
 
 ## 🎯 Inference

+ 1. Download the model weights from [Hugging Face](https://huggingface.co/chaubeyG/FaceLLaVA) into a `checkpoints/` folder so that the structure becomes `./checkpoints/FaceLLaVA`.
+
+ 2. Crop the input image/video using `tools/crop_face.py` before further processing.

+ Use the following command to crop an image:
+
+ ```bash
+ python crop_face.py \
+     --mode image \
+     --image_path "/path/to/input.jpg" \
+     --output_image_path "/path/to/output_cropped.jpg"
+ ```
+
+ Use the following command to crop a video:
+
+ ```bash
+ python crop_face.py \
+     --mode video \
+     --video_path "/path/to/input/video.mp4" \
+     --output_video_path "/path/to/output/cropped_video.mp4" \
+     --temp_dir "/path/to/temp"
+ ```
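When many images need cropping, the image-mode command above can be generated from a small driver script. A minimal sketch: only the flag names come from the usage shown above; the `crop_command` helper and the `cropped_faces/` output naming are illustrative assumptions.

```python
from pathlib import Path

def crop_command(image_path: str, out_dir: str = "cropped_faces") -> list[str]:
    """Build the crop_face.py image-mode invocation shown above.

    Only the flag names come from the README; the output naming
    scheme (<stem>_cropped<ext>) is an illustrative assumption.
    """
    src = Path(image_path)
    dst = Path(out_dir) / f"{src.stem}_cropped{src.suffix}"
    return [
        "python", "crop_face.py",
        "--mode", "image",
        "--image_path", str(src),
        "--output_image_path", str(dst),
    ]

# Print the commands for a small batch (paths are placeholders);
# pass each list to subprocess.run(cmd, check=True) to execute.
for path in ["faces/a.jpg", "faces/b.png"]:
    print(" ".join(crop_command(path)))
```

Building the argument list instead of a shell string avoids quoting issues with paths that contain spaces.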

 3. Run the following command for inference.

 ```bash
+ CUDA_VISIBLE_DEVICES=0 python inference.py --model_path="./checkpoints/FaceLLaVA" \
 --file_path="./assets/demo_inputs/face_attr_example_1.png" --prompt="What are the facial attributes in the given image?"
 ```

+ 4. **Currently, the following face perception tasks are supported, along with the best-suited input modality for each: Emotion (video), Age (image), Facial Attributes (image), Facial Action Units (image).**

 5. A list of prompts that work well for different tasks is present in `./assets/good_prompts`.
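The per-task inference runs described in steps 3-5 can be scripted. A minimal sketch, assuming the `inference.py` flags shown in step 3; only the facial-attributes file/prompt pair comes from this README, and the age entry is a hypothetical placeholder.

```python
import subprocess  # used if you uncomment the run line below

CHECKPOINT = "./checkpoints/FaceLLaVA"

# Task -> (input file, prompt). Only the facial-attributes pair comes from
# the README; the age entry is a hypothetical placeholder.
TASKS = {
    "attributes": ("./assets/demo_inputs/face_attr_example_1.png",
                   "What are the facial attributes in the given image?"),
    "age": ("./assets/demo_inputs/age_example.png",
            "What is the approximate age of the person in the image?"),
}

def build_command(model_path: str, file_path: str, prompt: str) -> list[str]:
    """Assemble the inference.py invocation from step 3."""
    return [
        "python", "inference.py",
        f"--model_path={model_path}",
        f"--file_path={file_path}",
        f"--prompt={prompt}",
    ]

for task, (file_path, prompt) in TASKS.items():
    cmd = build_command(CHECKPOINT, file_path, prompt)
    print(task, "->", " ".join(cmd))
    # subprocess.run(cmd, check=True)  # set CUDA_VISIBLE_DEVICES=0 in the env first
```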

 ### ✅ Repository Progress

+ - [ ] Dataset Release
+ - [x] Training Script
+ - [x] Inference Code
+ - [x] Model Weights
+
+ ## ⚖️ License
+
+ This code is distributed under the USC Research license. See [LICENSE.rst](LICENSE.rst) for more details.
+
+ ## 🪶 Citation
+
+ ```bibtex
+ @article{chaubey2025face,
+   title={Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning},
+   author={Chaubey, Ashutosh and Guan, Xulang and Soleymani, Mohammad},
+   journal={arXiv preprint arXiv:2504.07198},
+   year={2025}
+ }
+ ```