RedstoneWhite committed on
Commit 2f86691 · verified · 1 Parent(s): 9d0ae80

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -6,7 +6,7 @@ base_model:
 
 # What's New
 
- This repo contains an FP8 quantized HunyuanImage-3.0-Instruct-Distil using LLM-compressor with a similar recipe from [HunyuanImage-3.0-Instruct-Distil-INT8-v2](https://huggingface.co/EricRollei/HunyuanImage-3.0-Instruct-Distil-INT8-v2). Model codes are patched to enable inference with FlashInfer CUTLASS FP8 MoE kernel. Tested on a DGX Spark Founder Edition with FlashInfer==0.6.8 and a modified [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) node in ComfyUI. Inference time with cot_recaption reduced from ~1400s to ~200s. Patched ComfyUI nodes will be uploaded soon.
+ This repo contains an FP8 quantized HunyuanImage-3.0-Instruct-Distil using LLM-compressor with a similar recipe from [HunyuanImage-3.0-Instruct-Distil-INT8-v2](https://huggingface.co/EricRollei/HunyuanImage-3.0-Instruct-Distil-INT8-v2). Model codes are patched to enable inference with FlashInfer CUTLASS FP8 MoE kernel. Tested on a DGX Spark Founder Edition with FlashInfer==0.6.8 and a modified [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) node in ComfyUI. Inference time with cot_recaption reduced from ~1400s to ~200s. ~~Patched ComfyUI nodes will be uploaded soon.~~ Patched nodes can be found at [Here](https://github.com/redstonewhite/Comfy_HunyuanImage3). Use it just like the original one and you should be fine.
 
 Feel free to open an issue if you encounter any problem when trying to use it.
 