Bedovyy committed · Commit f6435fa · verified · 1 Parent(s): 9625f2a

Update README.md

Files changed (1): README.md (+8 −7)
README.md CHANGED
@@ -13,14 +13,16 @@ base_model_relation: quantized
 
 There are two models - FP8 and NVFP4Mixed.
 
- - FP8 : (***recommend***) maximize generation speed while preserving quality as much as possible.
- - NVFP4Mixed : (***marginal quality***) Mixture of FP8 and NVFP4.
+ - FP8 (2.4GB) : (***recommended***) maximizes generation speed while preserving quality as much as possible.
+ - NVFP4Mixed (2.0GB) : (***marginal quality***) a mixture of FP8 and NVFP4.
+
+ To use `torch.compile`, use the `TorchCompileModelAdvanced` node from KJNodes, set the mode to `max-autotune-no-cudagraphs`, and make sure `dynamic` is set to `false`.
 
 
 ## Generation speed
 
 Tested on
- - RTX5090 (400W), ComfyUI with `--fast`option, torch2.10.0+cu130
+ - RTX5090 (400W), ComfyUI with `--fast` option, torch2.10.0+cu130
 - Generates 832x1216, 30steps, cfg 4.0, er sde, simple
 
 | quant | none | sage+torch.compile |
@@ -30,14 +32,13 @@ Tested on
 | nvfp4mix | 6.37s/4.71it/s (+12%) | 4.99s/6.01it/s (+43%) |
 
 
-
 ## Sample
 
 | quant | sample |
 |------------|----------------------|
- | bf16 | ![anima-preview-bf16](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/MUYIxQjZZxX5wGGwPHS3G.webp) |
- | fp8 | ![anima-preview-fp8](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/b09IufBT31yDxg_BRZR3w.webp) |
- | nvfp4mixed | ![anima-preview-nvfp4](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/FMcirkGeDMAdbzkniOARF.webp) |
+ | **bf16** | ![anima-preview-bf16](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/MUYIxQjZZxX5wGGwPHS3G.webp) |
+ | **fp8** | ![anima-preview-fp8](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/b09IufBT31yDxg_BRZR3w.webp) |
+ | **nvfp4mixed** | ![anima-preview-nvfp4](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/FMcirkGeDMAdbzkniOARF.webp) |
 
 
 ## Quantized layers