metadata
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
license: mit
datasets:
  - carolineec/CyclePrefDB-I2T
  - carolineec/CyclePrefDB-T2I
language:
  - en

Model Card for CycleReward-Combo

Project page | Paper | Code

CycleReward-Combo is a reward model for image-text alignment, trained on both image-to-text and text-to-image comparison pairs from the CyclePrefDB-I2T and CyclePrefDB-T2I datasets.

This model has been pushed to the Hub using the PyTorchModelHubMixin integration.

Loading the model

Download model.py, med_config.json, and the blip folder from this repository into your working directory, so that model.py can be imported. You can then load the pretrained model with the code below:

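import torch
from PIL import Image
from model import CycleReward  # model.py downloaded from this repository

# Fall back to the CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = CycleReward.from_pretrained("carolineec/CycleReward-Combo")
model.to(device)
model.eval()

# Preprocess the image and add a batch dimension
preprocess = model.preprocess
image_path = "cat.jpg"
caption = "a photo of a cat"
image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
print("prepared data")

# Score the image-caption pair without tracking gradients
with torch.no_grad():
    score = model.score(image, caption)
print("my score:", score.item())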
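
A common use of a reward model like this is to compare several candidate captions for the same image. The sketch below is one possible way to do that, not part of the released code: rank_captions is a hypothetical helper, and it assumes that the scoring function behaves like model.score above (one preprocessed image and one caption string in, a single-element value out) and that a higher score means better alignment.

```python
from typing import Any, Callable, List, Sequence, Tuple


def rank_captions(
    score_fn: Callable[[Any, str], Any],
    image: Any,
    captions: Sequence[str],
) -> List[Tuple[str, float]]:
    """Score each caption against one image and sort best-first.

    score_fn is assumed to behave like model.score in the snippet above:
    it takes a preprocessed image and a caption string and returns a
    single-element value (for example a one-element torch tensor).
    """
    # float() accepts plain numbers as well as single-element tensors.
    scored = [(caption, float(score_fn(image, caption))) for caption in captions]
    # Assumption: a higher reward means better image-text alignment.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Stand-in scorer so the sketch runs without downloading the model:
    # it counts how many words a caption shares with a fixed reference set.
    def toy_score(image: Any, caption: str) -> int:
        return len(set(caption.split()) & {"a", "photo", "of", "cat"})

    for caption, score in rank_captions(toy_score, None, ["a dog", "a photo of a cat"]):
        print(f"{score:.1f}  {caption}")
```

With the model loaded as above, the call would look like rank_captions(model.score, image, ["a photo of a cat", "a photo of a dog"]), ideally inside a torch.no_grad() block.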

Citation

@article{bahng2025cyclereward,
  title={Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences},
  author={Bahng, Hyojin and Chan, Caroline and Durand, Fredo and Isola, Phillip},
  journal={arXiv preprint arXiv:2506.02095},
  year={2025}
}