	---
	tags:
	- music
	- video
	---

	This repository contains all of the models needed to run the local edition of [tarmomatic](https://codeberg.org/jaahas/tarmomatic) with ComfyUI.
	You also need the [ComfyUI-GGUF custom nodes](https://github.com/city96/ComfyUI-GGUF).
	Place each model in its respective subfolder of the ComfyUI `models` folder.


	## Installation

	1. Go to your ComfyUI directory
	2. Download with HF CLI (or git):
	```bash
	curl -LsSf https://hf.co/cli/install.sh | bash
	hf download jaahas/tarmomatic --local-dir models
	```
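If only one workflow is needed, a Python sketch along these lines could fetch just its files instead of the whole repository. It assumes the `huggingface_hub` package is installed and that the repository stores files in subfolders matching the ComfyUI layout. `WORKFLOW_PATTERNS` and `download_workflow_models` are hypothetical names introduced for illustration, and the glob patterns are derived from the filenames listed in this README.

```python
# Glob patterns per workflow, derived from the filenames listed in this README.
# The grouping and the names below are illustrative assumptions, not part of tarmomatic.
WORKFLOW_PATTERNS = {
    "flux": ["*flux1-schnell*"],
    "qwen": ["*qwen_2.5_vl_7b*", "*qwen_image_vae*", "*Qwen-Image-Edit-2509*"],
    "ltx": ["*t5xxl_fp8*", "*ltxv-2b*"],
    "wan": ["*umt5_xxl*", "*wan2.2_vae*", "*Wan2.2-TI2V-5B*"],
}


def download_workflow_models(workflow, local_dir="models"):
    """Download only the files matching one workflow's patterns into local_dir."""
    # Imported here so the pattern table can be inspected without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="jaahas/tarmomatic",
        local_dir=local_dir,
        allow_patterns=WORKFLOW_PATTERNS[workflow],
    )
```

Calling `download_workflow_models("flux")` from the ComfyUI directory would then fetch only the files whose paths match the Flux patterns.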

	## All Models List

	| Model Filename | Required ComfyUI Folder | Used In |
	|---|---|---|
	| `flux1-schnell-fp8.safetensors` | `models/checkpoints` | Flux |
	| `flux1-schnell-Q4_K_M.gguf` | `models/unet` | Flux |
	| `qwen_2.5_vl_7b_fp8_scaled.safetensors` | `models/text_encoders` | Qwen Image Edit |
	| `qwen_image_vae.safetensors` | `models/vae` | Qwen Image Edit |
	| `Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors` | `models/loras` | Qwen Image Edit |
	| `Qwen-Image-Edit-2509-Q4_K_M.gguf` | `models/unet` | Qwen Image Edit |
	| `t5xxl_fp8_e4m3fn_scaled.safetensors` | `models/text_encoders` | LTX Video |
	| `ltxv-2b-0.9.8-distilled-fp8.safetensors` | `models/checkpoints` | LTX Video |
	| `umt5_xxl_fp8_e4m3fn_scaled.safetensors` | `models/text_encoders` | Wan Video |
	| `wan2.2_vae.safetensors` | `models/vae` | Wan Video |
	| `Wan2.2-TI2V-5B-Q5_K_M.gguf` | `models/unet` | Wan Video |
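The table above can be turned into a quick sanity check after downloading. The following is a minimal sketch and not part of tarmomatic: `EXPECTED_MODELS` and `missing_models` are hypothetical names, and the script assumes it is run from the ComfyUI directory.

```python
from pathlib import Path

# Filename -> subfolder of the ComfyUI `models` folder, copied from the table above.
EXPECTED_MODELS = {
    "flux1-schnell-fp8.safetensors": "checkpoints",
    "flux1-schnell-Q4_K_M.gguf": "unet",
    "qwen_2.5_vl_7b_fp8_scaled.safetensors": "text_encoders",
    "qwen_image_vae.safetensors": "vae",
    "Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors": "loras",
    "Qwen-Image-Edit-2509-Q4_K_M.gguf": "unet",
    "t5xxl_fp8_e4m3fn_scaled.safetensors": "text_encoders",
    "ltxv-2b-0.9.8-distilled-fp8.safetensors": "checkpoints",
    "umt5_xxl_fp8_e4m3fn_scaled.safetensors": "text_encoders",
    "wan2.2_vae.safetensors": "vae",
    "Wan2.2-TI2V-5B-Q5_K_M.gguf": "unet",
}


def missing_models(models_dir):
    """Return 'folder/filename' for every expected model not found under models_dir."""
    root = Path(models_dir)
    return [
        f"{folder}/{name}"
        for name, folder in EXPECTED_MODELS.items()
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    # Report anything that is not where the table says it should be.
    for path in missing_models("models"):
        print("missing:", path)
```

An empty result means every file listed in the table is in its expected folder.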

	---

	## Benchmarks (RTX 5090, cold start)

	| Model | Generation time |
	|---|---|
	| Flux Schnell Q4_K_M (1024x1024) | 28s |
	| Qwen Image Edit 2509 Q4_K_M Lightning (1 image, 1024x1024) | 112s |
	| Wan 2.2 TI2V 5B Q5_K_M (10s, 720p) | 460s |
	| Wan 2.2 TI2V 5B Q5_K_M (10s, 720p, optimised) | 148s |
	| LTXV 2b 0.9.8 distilled fp8 (10s, 512p) | 47s |
	| TBA | --- |
	| Wan 2.2 I2V A14B Q5_K_M Lightning (10s, 720p) | 1074s |
	| Wan 2.2 I2V A14B Q5_K_M Lightning (10s, 480p) | 296s |
	| Eigen Banana Qwen Image Edit 2509 Q4_K_M (1 image, 1024x1024) | 151s |
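To compare rows, it can help to express the timings as ratios. The sketch below uses only figures from the table above; the constant and helper names are made up for illustration.

```python
# Cold-start timings in seconds, copied from the benchmark table above.
WAN_5B_DEFAULT = 460
WAN_5B_OPTIMISED = 148
WAN_A14B_720P = 1074
WAN_A14B_480P = 296
LTXV_512P = 47
CLIP_SECONDS = 10  # every video benchmark in the table renders a 10 s clip


def speedup(baseline_s, faster_s):
    """How many times faster the second run is than the first."""
    return baseline_s / faster_s


def render_cost(total_s, clip_s=CLIP_SECONDS):
    """Seconds of compute spent per second of generated video."""
    return total_s / clip_s


print(f"Wan 5B optimised vs default: {speedup(WAN_5B_DEFAULT, WAN_5B_OPTIMISED):.2f}x")
print(f"Wan A14B 480p vs 720p: {speedup(WAN_A14B_720P, WAN_A14B_480P):.2f}x")
print(f"LTXV 512p: {render_cost(LTXV_512P):.1f} s per second of video")
```

With the table's numbers, the optimised Wan 5B run comes out roughly 3.1x faster than the default one, and dropping the A14B model from 720p to 480p is roughly 3.6x faster.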


	---

	## Flux Models
	Used for general image generation (Workflow: `flux_schnell-GGUF.json`).

	- `flux1-schnell-fp8.safetensors`
	  - Folder: `models/checkpoints`
	  - Note: This provides the CLIP (Text Encoder) and VAE for the workflow.
	- `flux1-schnell-Q4_K_M.gguf`
	  - Folder: `models/unet`
	  - Note: This provides the actual diffusion model (UNet) in a compressed (quantized) format for better performance.

	## Qwen Image Models
	Used for image editing and synthesis (Workflows: `image_qwen_image_edit_2509-GGUF-*.json`).

	- `qwen_2.5_vl_7b_fp8_scaled.safetensors`
	  - Folder: `models/text_encoders`
	- `qwen_image_vae.safetensors`
	  - Folder: `models/vae`
	- `Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors`
	  - Folder: `models/loras`
	- `Qwen-Image-Edit-2509-Q4_K_M.gguf`
	  - Folder: `models/unet`

	## LTX Models
	Used for image-to-video generation (Workflow: `ltxv_image_to_video.json`).

	- `t5xxl_fp8_e4m3fn_scaled.safetensors`
	  - Folder: `models/text_encoders`
	- `ltxv-2b-0.9.8-distilled-fp8.safetensors`
	  - Folder: `models/checkpoints`

	## Wan Models
	Used for video generation (Workflow: `video_wan2_2_5B_ti2v-GGUF.json`).

	- `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
	  - Folder: `models/text_encoders`
	- `wan2.2_vae.safetensors`
	  - Folder: `models/vae`
	- `Wan2.2-TI2V-5B-Q5_K_M.gguf`
	  - Folder: `models/unet`

	---

	## FAQ

	### Why does Flux need both a GGUF and a Checkpoint?
	The workflow uses a "hybrid" loading strategy:
	1. Checkpoint (`flux1-schnell-fp8.safetensors`): Loads the CLIP (text understanding) and VAE (image decoding) components.
	2. GGUF (`flux1-schnell-Q4_K_M.gguf`): Loads the UNet (image generation core).

	This setup lets you run a highly compressed, fast UNet (GGUF) while still taking the necessary support components (CLIP and VAE) from the standard checkpoint.
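The hybrid wiring can be pictured as a ComfyUI API-format graph. The following is a hedged sketch, not the contents of `flux_schnell-GGUF.json`: the node class names are assumed to match stock ComfyUI and ComfyUI-GGUF, the node ids are arbitrary labels, and the prompt and sampler settings are placeholder values.

```python
# Illustrative graph only: node ids ("ckpt", "gguf", ...) are made-up labels, and the
# class names are assumed to be those of stock ComfyUI and the ComfyUI-GGUF nodes.
workflow = {
    # The checkpoint is loaded for its CLIP (output 1) and VAE (output 2).
    "ckpt": {"class_type": "CheckpointLoaderSimple",
             "inputs": {"ckpt_name": "flux1-schnell-fp8.safetensors"}},
    # The quantized diffusion model comes from the GGUF loader (output 0).
    "gguf": {"class_type": "UnetLoaderGGUF",
             "inputs": {"unet_name": "flux1-schnell-Q4_K_M.gguf"}},
    "prompt": {"class_type": "CLIPTextEncode",
               "inputs": {"text": "a lighthouse at dusk", "clip": ["ckpt", 1]}},
    "latent": {"class_type": "EmptyLatentImage",
               "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    # Sampler settings are placeholders, not values taken from the real workflow.
    "sampler": {"class_type": "KSampler",
                "inputs": {"model": ["gguf", 0],
                           "positive": ["prompt", 0], "negative": ["prompt", 0],
                           "latent_image": ["latent", 0],
                           "seed": 0, "steps": 4, "cfg": 1.0,
                           "sampler_name": "euler", "scheduler": "simple",
                           "denoise": 1.0}},
    "decode": {"class_type": "VAEDecode",
               "inputs": {"samples": ["sampler", 0], "vae": ["ckpt", 2]}},
}


def source_of(node_id, input_name):
    """Return the id of the node whose output feeds the given input."""
    return workflow[node_id]["inputs"][input_name][0]


print(source_of("sampler", "model"))  # diffusion model: GGUF loader
print(source_of("prompt", "clip"))    # text encoder: checkpoint
print(source_of("decode", "vae"))     # VAE: checkpoint
```

The point of the sketch is the routing: the sampler takes its model from the GGUF loader, while text encoding and decoding both draw on the checkpoint.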