---
license: other
license_name: ideogram-4-non-commercial
license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8/blob/main/LICENSE.md
pipeline_tag: text-to-image
tags:
  - text-to-image
  - image-generation
  - diffusion
  - flow-matching
  - dit
  - ideogram
---

This is an INT8 quantization of Ideogram4. The double quantization is unavoidable because of the release state of Ideogram4: the source weights are already FP8.

The FP8 weights were dequantized to FP32 using the FP8 scales, then downcast to BF16 before being converted to INT8.
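To make the order of the conversion steps concrete, here is a minimal pure-Python sketch of the numerics on a single weight row. It is not the script used to produce these weights. The function names, the sample values, and the symmetric per-row absmax INT8 scheme are all assumptions for illustration; the actual INT8 layout expected by ComfyUI-INT8-Fast may differ.

```python
import struct

def round_to_bf16(x: float) -> float:
    """Round x to float32, then to the nearest bfloat16 (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of the float32 bit pattern.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def dequantize_fp8_row(fp8_values, fp8_scale):
    """Apply the FP8 scale, then downcast to BF16.

    fp8_values are assumed to be FP8 weights already decoded to floats.
    """
    return [round_to_bf16(v * fp8_scale) for v in fp8_values]

def quantize_int8_row(row):
    """Symmetric absmax INT8 quantization (hypothetical scheme)."""
    absmax = max(abs(v) for v in row)
    scale = absmax / 127.0 if absmax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in row]
    return q, scale

# Illustrative values only, not taken from the real checkpoint.
fp8_row = [1.0, -0.5, 0.25, 2.0]
fp8_scale = 0.5

bf16_row = dequantize_fp8_row(fp8_row, fp8_scale)
int8_row, int8_scale = quantize_int8_row(bf16_row)
restored = [q * int8_scale for q in int8_row]
```

Comparing `restored` with `bf16_row` shows the rounding error added by the second quantization step, on top of whatever error the original FP8 release already carries.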

Intended for use in ComfyUI with https://github.com/BobJohnson24/ComfyUI-INT8-Fast

On my 3090, without torch compile, INT8 runs at 2.03 s/it versus 3.62 s/it for FP8, which is about 1.78x faster.

<s>~2x faster with torch compile.</s>

After further inspection, it appears that torch compile may cause quality issues with this model.

Quick comparison:

<img src="Comparison.jpg" width="1000" height="500">

<img src="Comparison2.jpg" width="1000" height="500">