---
tags:
- music
- video
---

All of the models needed to run **[tarmomatic](https://codeberg.org/jaahas/tarmomatic)** local edition with ComfyUI.
You also need the [ComfyUI-GGUF custom nodes](https://github.com/city96/ComfyUI-GGUF).
Place each model in its respective subfolder of the ComfyUI `models` folder.


## Installation

1. Go to your ComfyUI directory.
2. Download with the HF CLI (or git):

   ```bash
   curl -LsSf https://hf.co/cli/install.sh | bash
   hf download jaahas/tarmomatic --local-dir models
   ```
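If the files arrive flat (or you fetch them individually), you can pre-create the subfolders that the model table below expects; a minimal sketch, run from the ComfyUI root:

```bash
# Pre-create the ComfyUI model subfolders used by tarmomatic.
mkdir -p models/checkpoints models/unet models/text_encoders models/vae models/loras
```

If you only need a subset of the models, `hf download` also accepts `--include` glob filters (e.g. `--include "*.gguf"`), assuming it mirrors the `huggingface-cli download` options.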

## All Models List

| Model Filename | Required ComfyUI Folder | Used In |
|---|---|---|
| `flux1-schnell-fp8.safetensors` | `models/checkpoints` | Flux |
| `flux1-schnell-Q4_K_M.gguf` | `models/unet` | Flux |
| `qwen_2.5_vl_7b_fp8_scaled.safetensors` | `models/text_encoders` | Qwen Image Edit |
| `qwen_image_vae.safetensors` | `models/vae` | Qwen Image Edit |
| `Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors` | `models/loras` | Qwen Image Edit |
| `Qwen-Image-Edit-2509-Q4_K_M.gguf` | `models/unet` | Qwen Image Edit |
| `t5xxl_fp8_e4m3fn_scaled.safetensors` | `models/text_encoders` | LTX Video |
| `ltxv-2b-0.9.8-distilled-fp8.safetensors` | `models/checkpoints` | LTX Video |
| `umt5_xxl_fp8_e4m3fn_scaled.safetensors` | `models/text_encoders` | Wan Video |
| `wan2.2_vae.safetensors` | `models/vae` | Wan Video |
| `Wan2.2-TI2V-5B-Q5_K_M.gguf` | `models/unet` | Wan Video |
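To double-check the layout after downloading, a small POSIX-shell sketch like this reports which files from the table are in place (`COMFY_DIR` is an assumed environment variable pointing at your ComfyUI checkout):

```bash
# Sketch: report whether each model sits in its expected ComfyUI subfolder.
COMFY_DIR="${COMFY_DIR:-$HOME/ComfyUI}"

check() {  # $1 = subfolder, $2 = filename
  if [ -f "$COMFY_DIR/models/$1/$2" ]; then
    echo "ok:      models/$1/$2"
  else
    echo "missing: models/$1/$2"
  fi
}

check checkpoints   flux1-schnell-fp8.safetensors
check unet          flux1-schnell-Q4_K_M.gguf
check text_encoders qwen_2.5_vl_7b_fp8_scaled.safetensors
check vae           qwen_image_vae.safetensors
check loras         Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors
check unet          Qwen-Image-Edit-2509-Q4_K_M.gguf
check text_encoders t5xxl_fp8_e4m3fn_scaled.safetensors
check checkpoints   ltxv-2b-0.9.8-distilled-fp8.safetensors
check text_encoders umt5_xxl_fp8_e4m3fn_scaled.safetensors
check vae           wan2.2_vae.safetensors
check unet          Wan2.2-TI2V-5B-Q5_K_M.gguf
```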

---

## Benchmarks (RTX 5090, cold start)

| Model | Time |
|---|---|
| Flux Schnell Q4_K_M (1024x1024) | 28s |
| Qwen Image Edit 2509 Q4_K_M Lightning (1 image, 1024x1024) | 112s |
| Wan 2.2 TI2V 5B Q5_K_M (10s, 720p) | 460s |
| Wan 2.2 TI2V 5B Q5_K_M (10s, 720p, optimised) | 148s |
| LTXV 2b 0.9.8 distilled fp8 (10s, 512p) | 47s |
| TBA | --- |
| Wan 2.2 I2V A14B Q5_K_M Lightning (10s, 720p) | 1074s |
| Wan 2.2 I2V A14B Q5_K_M Lightning (10s, 480p) | 296s |
| Eigen Banana Qwen Image Edit 2509 Q4_K_M (1 image, 1024x1024) | 151s |


---

## Flux Models
Used for general image generation (Workflow: `flux_schnell-GGUF.json`).

- **`flux1-schnell-fp8.safetensors`**
  - Folder: `models/checkpoints`
  - **Note:** This provides the CLIP (Text Encoder) and VAE for the workflow.
- **`flux1-schnell-Q4_K_M.gguf`**
  - Folder: `models/unet`
  - **Note:** This provides the actual diffusion model (UNet) in a compressed (quantized) format for better performance.

## Qwen Image Models
Used for image editing and synthesis (Workflows: `image_qwen_image_edit_2509-GGUF-*.json`).

- **`qwen_2.5_vl_7b_fp8_scaled.safetensors`**
  - Folder: `models/text_encoders`
- **`qwen_image_vae.safetensors`**
  - Folder: `models/vae`
- **`Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors`**
  - Folder: `models/loras`
- **`Qwen-Image-Edit-2509-Q4_K_M.gguf`**
  - Folder: `models/unet`

## LTX Models
Used for image-to-video generation (Workflow: `ltxv_image_to_video.json`).

- **`t5xxl_fp8_e4m3fn_scaled.safetensors`**
  - Folder: `models/text_encoders`
- **`ltxv-2b-0.9.8-distilled-fp8.safetensors`**
  - Folder: `models/checkpoints`

## Wan Models
Used for video generation (Workflow: `video_wan2_2_5B_ti2v-GGUF.json`).

- **`umt5_xxl_fp8_e4m3fn_scaled.safetensors`**
  - Folder: `models/text_encoders`
- **`wan2.2_vae.safetensors`**
  - Folder: `models/vae`
- **`Wan2.2-TI2V-5B-Q5_K_M.gguf`**
  - Folder: `models/unet`

---

## FAQ

### Why does Flux need both a GGUF and a Checkpoint?
The workflow uses a "hybrid" loading strategy:

1. **Checkpoint (`flux1-schnell-fp8.safetensors`):** loads the **CLIP** (text understanding) and **VAE** (image decoding) components.
2. **GGUF (`flux1-schnell-Q4_K_M.gguf`):** loads the **UNet** (the image generation core).

This split lets you run a highly compressed, fast UNet (GGUF) while still getting the necessary support components from the standard checkpoint.