Duplicated from bertbobson/Ideogram-4-INT8-ConvRot

milo01/Ideogram-4-INT8-ConvRot


	---
	license: other
	license_name: ideogram-4-non-commercial
	license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8/blob/main/LICENSE.md
	pipeline_tag: text-to-image
	tags:
	- text-to-image
	- image-generation
	- diffusion
	- flow-matching
	- dit
	- ideogram
	---

	This checkpoint is an unavoidable double quantization: given the release state of Ideogram 4, the INT8 weights had to be derived from the FP8 weights.

	The FP8 weights were dequantized to FP32 using their FP8 scales, then downcast to BF16 before being converted to INT8.
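
A minimal pure-Python sketch of that conversion chain, for illustration only. It is not the script used to produce these weights. The symmetric per-tensor absmax INT8 step, the function names (`to_fp32`, `to_bf16`, `quantize_int8`, `fp8_to_int8`), and the sample values are all assumptions; the actual ConvRot conversion may use a different scheme.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round an FP32 value to BF16 (round-to-nearest-even).

    BF16 keeps the top 16 bits of the FP32 bit pattern.
    NaN inputs are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add the rounding bias, then drop the low 16 bits.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def quantize_int8(weights):
    """Symmetric per-tensor absmax INT8 quantization (assumed scheme)."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127.0 if absmax > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def fp8_to_int8(fp8_values, fp8_scale):
    """FP8 values (already decoded to floats) -> FP32 -> BF16 -> INT8."""
    # Step 1: dequantize with the FP8 scale, held in FP32.
    fp32 = [to_fp32(v * fp8_scale) for v in fp8_values]
    # Step 2: downcast to BF16.
    bf16 = [to_bf16(w) for w in fp32]
    # Step 3: quantize to INT8; the new scale is stored with the weights.
    return quantize_int8(bf16)


# Hypothetical FP8 payload values and scale, chosen only for the demo.
int8_weights, int8_scale = fp8_to_int8([0.5, -1.75, 448.0, 0.015625], 0.01)
print(int8_weights, int8_scale)
```

Each stage rounds again, so the INT8 result carries the error of the original FP8 quantization plus the error of the new INT8 grid. That is the "double quantization" described above.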

	For use in ComfyUI with https://github.com/BobJohnson24/ComfyUI-INT8-Fast

	On my 3090, without torch compile, it runs 1.78x faster than FP8 (2.03 s/it vs. 3.62 s/it).

	<s>~2x faster with torch compile.</s>

	On further inspection, compiling this model with torch compile appears to cause quality issues.

	Quick comparison:

	<img src="Comparison.jpg" width="1000" height="500">

	<img src="Comparison2.jpg" width="1000" height="500">