---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
license: mit
datasets:
- carolineec/CyclePrefDB-I2T
- carolineec/CyclePrefDB-T2I
language:
- en
---

# Model Card for CycleReward-Combo

[Project page](https://cyclereward.github.io) | [Paper](https://huggingface.co/papers/2506.02095) | [Code](https://github.com/hjbahng/cyclereward)

CycleReward-Combo is a reward model for image-text alignment, trained on both image-to-text and text-to-image comparison pairs from the [CyclePrefDB-I2T](https://huggingface.co/datasets/carolineec/CyclePrefDB-I2T) and [CyclePrefDB-T2I](https://huggingface.co/datasets/carolineec/CyclePrefDB-T2I) datasets.

This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.

## Loading the model

Download `model.py`, `med_config.json`, and the `blip` folder from this repository. You can then load the pretrained model as follows:

```python
import torch
from PIL import Image

from model import CycleReward

device = "cuda"
model = CycleReward.from_pretrained("carolineec/CycleReward-Combo")
model.to(device)
model.eval()
preprocess = model.preprocess

# Prepare an image-caption pair
image_path = "cat.jpg"
caption = "a photo of a cat"
image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)

# Compute the alignment score for the pair
score = model.score(image, caption)
print("score:", score.item())
```

## Citation

```
@article{bahng2025cyclereward,
  title={Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences},
  author={Bahng, Hyojin and Chan, Caroline and Durand, Fredo and Isola, Phillip},
  journal={arXiv preprint arXiv:2506.02095},
  year={2025}
}
```
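As a usage note on the loading example above: since `model.score` returns a scalar alignment score, candidate captions for one image can be ranked by scoring each one. The sketch below uses a hypothetical `rank_captions` helper (not part of this repository); the scoring callable is passed in, e.g. `lambda c: model.score(image, c).item()`, so the ranking logic stays independent of the model:

```python
from typing import Callable, List, Tuple

def rank_captions(
    score_fn: Callable[[str], float],
    captions: List[str],
) -> List[Tuple[str, float]]:
    """Rank candidate captions by alignment score, best first.

    `score_fn` wraps a scoring call such as
    `lambda c: model.score(image, c).item()` from the example above.
    """
    scored = [(caption, score_fn(caption)) for caption in captions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Stand-in scorer for illustration; the real model call is shown above
fake_scores = {"a photo of a cat": 0.9, "a photo of a dog": 0.2}
ranked = rank_captions(fake_scores.get, list(fake_scores))
print(ranked[0][0])  # best-aligned caption under the stand-in scorer
```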