---
license: mit
datasets:
- SoroushMehraban/3D-Pain
---
# ViTPain: Pretrained Vision Transformer for Pain Assessment
Pretrained checkpoint for ViTPain, a reference-guided Vision Transformer for automated pain intensity assessment. Trained on the 3D-Pain synthetic dataset. Use this checkpoint to fine-tune on real pain datasets (e.g. UNBC-McMaster).
## Model Details
- Architecture: DinoV3-large backbone + LoRA adapters (rank=8, alpha=16); see the sketch after this list
- Task: PSPI (Prkachin and Solomon Pain Intensity) regression on a 0–16 scale, plus Action Unit (AU) prediction
- Training: 3D-Pain synthetic faces, 150 epochs
- Best checkpoint: epoch 141, validation MAE 1.859
- Input: 224×224 RGB face image + optional neutral reference image
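
For orientation, here is a minimal sketch of what a LoRA adapter with rank=8 and alpha=16 does to one frozen linear layer. This is illustrative only: the actual adapter placement inside the DinoV3 backbone is defined in the PainGeneration repo, and the `LoRALinear` class below is a hypothetical stand-in, not the repo's implementation.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical stand-in: a frozen linear layer plus a trainable
    low-rank (LoRA) update, scaled by alpha / rank."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # update starts at zero, so training
        self.scale = alpha / rank            # begins from the pretrained model

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))
```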
## Download

```bash
pip install huggingface-hub
huggingface-cli download xinlei55555/ViTPain vitpain-epoch=141-val_regression_mae=1.859.ckpt --local-dir ./
```
Or in Python:
```python
from huggingface_hub import hf_hub_download

checkpoint = hf_hub_download(
    repo_id="xinlei55555/ViTPain",
    filename="vitpain-epoch=141-val_regression_mae=1.859.ckpt",
)
```
## Load and Use
Clone the PainGeneration repo, then:
```python
from lib.models.vitpain import ViTPain

model = ViTPain.load_from_checkpoint(checkpoint)
model.eval()
# Input: pain image + optional neutral reference; output: pspi_pred (0–1, scale to 0–16), aus_pred
```
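
A hedged end-to-end sketch, putting the pieces above together. The preprocessing (224×224 resize with ImageNet normalization), the forward signature `model(pain, neutral)`, and the output keys `pspi_pred` / `aus_pred` are assumptions here; check the PainGeneration repo for the exact interface.

```python
import torch
from PIL import Image
from torchvision import transforms

# Assumed preprocessing: 224x224 resize + ImageNet normalization.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

pain = preprocess(Image.open("pain_frame.jpg").convert("RGB")).unsqueeze(0)
neutral = preprocess(Image.open("neutral_frame.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    out = model(pain, neutral)  # assumed signature and dict output

pspi = out["pspi_pred"].item() * 16  # model outputs 0-1; rescale to PSPI 0-16
```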
## Fine-tuning on UNBC-McMaster
The checkpoints in this repository were pretrained on the synthetic 3D-Pain dataset only; a version fine-tuned on the UNBC-McMaster dataset will be released shortly.
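
Until then, a minimal fine-tuning sketch along these lines should work, assuming ViTPain is a PyTorch Lightning module (implied by `load_from_checkpoint`) and that `train_loader` / `val_loader` are hypothetical DataLoaders yielding the same batch format used in pretraining:

```python
import pytorch_lightning as pl

from lib.models.vitpain import ViTPain

# Resume from the pretrained 3D-Pain checkpoint downloaded above.
model = ViTPain.load_from_checkpoint(checkpoint)

# train_loader / val_loader: hypothetical UNBC-McMaster DataLoaders,
# expected to match the batch format the model was pretrained with.
trainer = pl.Trainer(max_epochs=20, accelerator="auto")
trainer.fit(model, train_dataloaders=train_loader, val_dataloaders=val_loader)
```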
## Citation

```bibtex
@article{lin2025pain,
  title={Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment},
  author={Lin, Xin Lei and Mehraban, Soroush and Moturu, Abhishek and Taati, Babak},
  journal={arXiv preprint arXiv:2509.16727},
  year={2025}
}
```
## License
MIT