Update README.md

README.md

[arXiv](https://arxiv.org/abs/2504.07198)
[Hugging Face](https://huggingface.co/chaubeyG/FaceLLaVA)
[License](LICENSE.rst)
[Python](https://www.python.org/)

These are the officially released weights of the **WACV 2026 Round 1** Early Accept paper (6.4% acceptance rate), Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning. Please refer to the [official GitHub repository](https://github.com/ihp-lab/face-llava) for instructions to run inference.

---

## 📣 News

- [Oct. 2025] Initial release of the official codebase and model weights. Stay tuned for more details and the dataset.
- [Sept. 2025] FaceLLaVA was accepted in the first round of WACV 2026 (6.4% acceptance rate). See you in Tucson!

## 📦 Repository Structure

```bash
...
```

## 🎯 Inference

1. Download the model weights from [Hugging Face](https://huggingface.co/chaubeyG/FaceLLaVA) into the `checkpoints/` folder so that the resulting path is `./checkpoints/FaceLLaVA`.
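
If you prefer to script the download, here is a minimal sketch using the `huggingface_hub` package (an assumption; any Hugging Face client works, and the repo id is taken from the badge above):

```python
# Minimal download sketch -- assumes the `huggingface_hub` package is
# installed (pip install huggingface_hub).
from huggingface_hub import snapshot_download

# Mirror the released weights into ./checkpoints/FaceLLaVA.
snapshot_download(
    repo_id="chaubeyG/FaceLLaVA",
    local_dir="./checkpoints/FaceLLaVA",
)
```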

2. Crop the input image/video using `tools/crop_face.py` before further processing.

Use the following command to crop an image:

```bash
python crop_face.py \
    --mode image \
    --image_path "/path/to/input.jpg" \
    --output_image_path "/path/to/output_cropped.jpg"
```

Use the following command to crop a video:

```bash
python crop_face.py \
    --mode video \
    --video_path "/path/to/input/video.mp4" \
    --output_video_path "/path/to/output/cropped_video.mp4" \
    --temp_dir "/path/to/temp"
```
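
To crop a whole folder of images, one option is a small wrapper that shells out to the same script once per file. A sketch, assuming the hypothetical folder paths below and using only the flags documented above:

```python
# Hypothetical batch wrapper around crop_face.py -- the input/output folders
# are placeholders; only the documented --mode/--image_path/--output_image_path
# flags are used.
import subprocess
from pathlib import Path

in_dir = Path("/path/to/raw_images")       # placeholder input folder
out_dir = Path("/path/to/cropped_images")  # placeholder output folder
out_dir.mkdir(parents=True, exist_ok=True)

for img in sorted(in_dir.glob("*.jpg")):
    subprocess.run(
        [
            "python", "crop_face.py",
            "--mode", "image",
            "--image_path", str(img),
            "--output_image_path", str(out_dir / img.name),
        ],
        check=True,  # raise if cropping any file fails
    )
```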

3. Run the following command for inference:

```bash
CUDA_VISIBLE_DEVICES=0 python inference.py --model_path="./checkpoints/FaceLLaVA" \
    --file_path="./assets/demo_inputs/face_attr_example_1.png" --prompt="What are the facial attributes in the given image?"
```

4. **Currently, the following face perception tasks are supported, along with the modality best suited to each: Emotion (video), Age (image), Facial Attributes (image), Facial Action Units (image).**

5. A list of prompts that work well for different tasks is present in `./assets/good_prompts`; the sketch below loops over one example call per task.
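
A minimal sketch of such a loop, with placeholder input paths and illustrative prompts (prefer the prompts shipped in `./assets/good_prompts`):

```python
# Hypothetical per-task driver: one inference.py call per supported task.
# File paths and prompts are placeholders -- substitute your own cropped
# inputs and prompts from ./assets/good_prompts.
import subprocess

examples = [
    # (task, cropped input, illustrative prompt)
    ("emotion", "/path/to/cropped_video.mp4", "What emotion is the person expressing?"),
    ("age", "/path/to/cropped_face.jpg", "What is the age of the person in the image?"),
    ("attributes", "/path/to/cropped_face.jpg", "What are the facial attributes in the given image?"),
    ("action units", "/path/to/cropped_face.jpg", "Which facial action units are activated in the image?"),
]

for task, file_path, prompt in examples:
    print(f"=== {task} ===")
    subprocess.run(
        [
            "python", "inference.py",
            "--model_path=./checkpoints/FaceLLaVA",
            f"--file_path={file_path}",
            f"--prompt={prompt}",
        ],
        check=True,
    )
```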

### ✅ Repository Progress

- [ ] Dataset Release
- [x] Training Script
- [x] Inference Code
- [x] Model Weights

## ⚖️ License

This code is distributed under the USC Research License. See [LICENSE.rst](LICENSE.rst) for more details.

## 🪶 Citation

```latex
@article{chaubey2025face,
  title={Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning},
  author={Chaubey, Ashutosh and Guan, Xulang and Soleymani, Mohammad},
  journal={arXiv preprint arXiv:2504.07198},
  year={2025}
}
```