GenLIP-L16-224


	---
	license: mit
	---
	This repository is the official model zoo for "Let ViT Speak: Generative Language-Image Pre-training".

	## Currently released models

	1. Models from fixed low-resolution pretraining:
	- GenLIP-L16-224
	- GenLIP-So16-224
	- GenLIP-g16-224

	2. NaViT models:
	- GenLIP-L16-NaViT
	- GenLIP-So16-NaViT
	- GenLIP-g16-NaViT

	We use the SigLIP image preprocessor for our fixed low-resolution models (-224) and a Qwen2-VL-style image preprocessor for our NaViT models (-NaViT).
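The suffix convention above can be sketched as a small lookup. This is only an illustration: the helper `preprocessor_family` and the family labels `"siglip"` and `"qwen2-vl-style"` are hypothetical names, not part of the GenLIP codebase. In practice the labels would likely correspond to image processors such as `SiglipImageProcessor` and `Qwen2VLImageProcessor` in `transformers`, but that mapping is an assumption; check the codebase for the actual loading code.

```python
# Minimal sketch: choose a preprocessor family from the model name suffix.
# The helper name and family labels are hypothetical, for illustration only.

# Suffix -> preprocessor family, following the naming rule in this README.
PREPROCESSOR_BY_SUFFIX = {
    "-224": "siglip",            # fixed low-resolution models
    "-NaViT": "qwen2-vl-style",  # NaViT models
}


def preprocessor_family(model_name: str) -> str:
    """Return the preprocessor family implied by the model name suffix."""
    for suffix, family in PREPROCESSOR_BY_SUFFIX.items():
        if model_name.endswith(suffix):
            return family
    raise ValueError(f"Unrecognized model name suffix: {model_name!r}")


if __name__ == "__main__":
    for name in ("GenLIP-L16-224", "GenLIP-So16-NaViT"):
        print(name, "->", preprocessor_family(name))
```

Matching on the suffix rather than the full name keeps the lookup valid for all six released checkpoints without listing each one.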

	Pretraining and implementation details can be found in our codebase [[GenLIP](https://github.com/YanFangCS/GenLIP)].