---
license: mit
datasets:
- SoroushMehraban/3D-Pain
---
# ViTPain: Pretrained Vision Transformer for Pain Assessment
Pretrained checkpoint for ViTPain, a reference-guided Vision Transformer for automated pain intensity assessment. Trained on the 3D-Pain synthetic dataset. Use this checkpoint to fine-tune on real pain datasets (e.g. UNBC-McMaster).
## Model Details
- Architecture: DinoV3-large backbone + LoRA adapters (rank=8, alpha=16); see the sketch after this list
- Task: PSPI (Prkachin and Solomon Pain Intensity) regression on a 0–16 scale, plus Action Unit (AU) prediction
- Training: 3D-Pain synthetic faces, 150 epochs
- Best checkpoint: epoch 141, validation MAE 1.859
- Input: 224×224 RGB face image + optional neutral reference image
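
For orientation, here is a minimal sketch of what a LoRA adapter with rank=8 and alpha=16 does to one frozen linear layer. This is illustrative only: the actual adapter placement inside the DinoV3 backbone is defined in the PainGeneration repo, and the `LoRALinear` class below is a hypothetical stand-in, not the repo's implementation.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical stand-in: a frozen linear layer plus a trainable
    low-rank (LoRA) update, scaled by alpha / rank."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # update starts at zero, so training
        self.scale = alpha / rank            # begins from the pretrained model

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))
```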
## Download

```bash
pip install huggingface-hub
huggingface-cli download xinlei55555/ViTPain vitpain-epoch=141-val_regression_mae=1.859.ckpt --local-dir ./
```
Or in Python:
```python
from huggingface_hub import hf_hub_download

checkpoint = hf_hub_download(
    repo_id="xinlei55555/ViTPain",
    filename="vitpain-epoch=141-val_regression_mae=1.859.ckpt",
)
```
## Load and Use
Clone the PainGeneration repo, then:
```python
from lib.models.vitpain import ViTPain

model = ViTPain.load_from_checkpoint(checkpoint)
model.eval()
# Input: pain image + optional neutral reference; output: pspi_pred (0–1, scale to 0–16), aus_pred
```
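
A hedged end-to-end sketch, putting the pieces above together. The preprocessing (224×224 resize with ImageNet normalization), the forward signature `model(pain, neutral)`, and the output keys `pspi_pred` / `aus_pred` are assumptions here; check the PainGeneration repo for the exact interface.

```python
import torch
from PIL import Image
from torchvision import transforms

# Assumed preprocessing: 224x224 resize + ImageNet normalization.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

pain = preprocess(Image.open("pain_frame.jpg").convert("RGB")).unsqueeze(0)
neutral = preprocess(Image.open("neutral_frame.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    out = model(pain, neutral)  # assumed signature and dict output

pspi = out["pspi_pred"].item() * 16  # model outputs 0-1; rescale to PSPI 0-16
```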
## Fine-tuning on UNBC-McMaster
The checkpoints in this repository were pretrained on the synthetic 3D-Pain dataset only; a version fine-tuned on the UNBC-McMaster dataset will be released shortly.
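
Until then, a minimal fine-tuning sketch along these lines should work, assuming ViTPain is a PyTorch Lightning module (implied by `load_from_checkpoint`) and that `train_loader` / `val_loader` are hypothetical DataLoaders yielding the same batch format used in pretraining:

```python
import pytorch_lightning as pl

from lib.models.vitpain import ViTPain

# Resume from the pretrained 3D-Pain checkpoint downloaded above.
model = ViTPain.load_from_checkpoint(checkpoint)

# train_loader / val_loader: hypothetical UNBC-McMaster DataLoaders,
# expected to match the batch format the model was pretrained with.
trainer = pl.Trainer(max_epochs=20, accelerator="auto")
trainer.fit(model, train_dataloaders=train_loader, val_dataloaders=val_loader)
```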
## Citation

```bibtex
@article{lin2025pain,
  title={Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment},
  author={Lin, Xin Lei and Mehraban, Soroush and Moturu, Abhishek and Taati, Babak},
  journal={arXiv preprint arXiv:2509.16727},
  year={2025}
}
```
## License
MIT