---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
license: mit
datasets:
- carolineec/CyclePrefDB-I2T
- carolineec/CyclePrefDB-T2I
language:
- en
---

# Model Card for CycleReward-Combo

[Project page](https://cyclereward.github.io) | [Paper](https://huggingface.co/papers/2506.02095) | [Code](https://github.com/hjbahng/cyclereward)

A reward model for image-text alignment, trained on both image-to-text and text-to-image comparison pairs from the [CyclePrefDB-I2T](https://huggingface.co/datasets/carolineec/CyclePrefDB-I2T) and [CyclePrefDB-T2I](https://huggingface.co/datasets/carolineec/CyclePrefDB-T2I) datasets.

This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.


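## Loading the model

Download `model.py`, `med_config.json`, and the `blip` folder from this repository and place them where Python can import them (for example, next to your script). You can then load the pretrained model and score an image-caption pair using the code below: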
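
The download step can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; `fetch_model_code` and the two constants are hypothetical names introduced here for illustration and are not part of this repository. It uses `snapshot_download` with `allow_patterns` to fetch only the three items listed above.

```python
# Repository and files named in this model card.
# "blip/*" is a glob pattern intended to cover the whole `blip` folder.
REPO_ID = "carolineec/CycleReward-Combo"
REQUIRED_PATTERNS = ["model.py", "med_config.json", "blip/*"]


def fetch_model_code(local_dir="."):
    """Download the source files needed for `from model import CycleReward`.

    Hypothetical helper (an assumption of this sketch, not a repository API).
    Returns the path of the folder the files were written to.
    """
    # Imported lazily so this module can be read without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=REQUIRED_PATTERNS,
        local_dir=local_dir,
    )
```

Calling `fetch_model_code(".")` from the directory that holds your script should leave `model.py`, `med_config.json`, and `blip/` beside it. With the files in place, the model loads as follows: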


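```python
import torch
from PIL import Image
from model import CycleReward

# Use the GPU when available; fall back to CPU otherwise
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = CycleReward.from_pretrained("carolineec/CycleReward-Combo")
model.to(device)
model.eval()

# Preprocess the image and add a batch dimension
preprocess = model.preprocess
image_path = "cat.jpg"
caption = "a photo of a cat"
image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
print('prepared data')

# Score the image-caption pair without tracking gradients
with torch.no_grad():
    score = model.score(image, caption)
print('my score:', score.item())
```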
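
A common use of such a score is to compare several candidate captions for one image. The sketch below is illustrative only: `rank_captions` is a hypothetical helper, not part of this repository, and it assumes that a higher score means better image-text alignment, as is usual for a reward model. It accepts any scoring callable, so it can be tried with a stand-in scorer before the real model is loaded.

```python
def rank_captions(score_fn, image, captions):
    """Return (caption, score) pairs sorted from highest to lowest score.

    Hypothetical helper: `score_fn(image, caption)` must return a number
    (or anything `float()` accepts, such as a one-element tensor).
    """
    scored = [(caption, float(score_fn(image, caption))) for caption in captions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Stand-in scorer for demonstration only; it ignores the image entirely.
def caption_length_score(image, caption):
    return len(caption)


ranking = rank_captions(caption_length_score, None, ["a cat", "a photo of a cat"])
for caption, score in ranking:
    print(f"{score:.1f}  {caption}")
```

With the model loaded as above, the scorer could be `lambda img, cap: model.score(img, cap).item()` and `image` the preprocessed tensor.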

## Citation

```bibtex
@article{bahng2025cyclereward,
  title={Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences},
  author={Bahng, Hyojin and Chan, Caroline and Durand, Fredo and Isola, Phillip},
  journal={arXiv preprint arXiv:2506.02095},
  year={2025}
}
```