# JoyAI-Image-Edit — NF4 4-bit Pre-Quantized Transformer
Pre-quantized NF4 (4-bit) version of the JoyAI-Image-Edit DiT transformer, created with bitsandbytes.
## Key Details
| Property | Value |
|---|---|
| File size | 7.83 GB (vs 31 GB bf16, 16 GB FP8) |
| Quantization | NF4 (bitsandbytes, double quantization) |
| Layers quantized | 326 `nn.Linear` → `Linear4bit` |
| Source weights | SanDiegoDude/JoyAI-Image-Edit-Safetensors (bf16) |
| Tested on | NVIDIA RTX 4090 (24 GB), NVIDIA GB10 |
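As a sanity check on the file size, here is a rough back-of-the-envelope estimate. The ~15.5B parameter count is inferred from the 31 GB bf16 checkpoint, and the per-block overheads follow the standard bitsandbytes NF4 layout; exact numbers depend on which layers stay unquantized.

```python
# Rough NF4 size estimate for a transformer whose bf16 checkpoint is 31 GB.
# Parameter count is inferred from the bf16 size (2 bytes/param) -- an assumption.
bf16_bytes = 31e9
params = bf16_bytes / 2                      # ~15.5B parameters

weight_bytes = params * 0.5                  # 4 bits per weight
# Double quantization: one 8-bit quantized absmax per 64-weight block,
# plus one fp32 secondary scale per 256 absmax values.
absmax_bytes = params / 64 + params / (64 * 256) * 4

total_gb = (weight_bytes + absmax_bytes) / 1e9
print(f"estimated NF4 size: {total_gb:.2f} GB")
```

This lands in the ballpark of the 7.83 GB on disk; the real file is slightly smaller because norm layers and other non-`nn.Linear` weights are not part of the 4-bit budget.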
This file loads directly in seconds — no runtime quantization needed.
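For intuition, NF4 maps each weight block to a fixed 16-value "normal float" codebook scaled by the block's absolute maximum. A minimal pure-Python sketch of that round trip (the codebook values are reproduced from the QLoRA paper / bitsandbytes source and are illustrative, not an exact reimplementation of `Linear4bit`):

```python
# NF4 codebook: 16 quantiles of a standard normal, normalized to [-1, 1]
# (values as used by bitsandbytes; treated as given here).
NF4_CODE = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]

def quantize_block(block):
    """Quantize one weight block to 4-bit codebook indices plus an absmax scale."""
    absmax = max(abs(w) for w in block) or 1.0
    idxs = [min(range(16), key=lambda i: abs(w / absmax - NF4_CODE[i]))
            for w in block]
    return idxs, absmax

def dequantize_block(idxs, absmax):
    return [NF4_CODE[i] * absmax for i in idxs]

weights = [0.31, -0.05, 0.92, -0.44, 0.0, 0.17, -0.88, 0.63]
idxs, scale = quantize_block(weights)
restored = dequantize_block(idxs, scale)
# Each restored weight sits within half the widest codebook gap, times absmax.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Because the codebook and scales are all the checkpoint needs, loading is just reading indices and absmax values from disk, which is why no runtime quantization pass is required.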
## Inference Tool
A Gradio UI, CLI, and REST API are available at SanDiegoDude/JoyAI-Image.
## Quick Start
```bash
git clone https://github.com/SanDiegoDude/JoyAI-Image.git
cd JoyAI-Image
python -m venv .venv && source .venv/bin/activate
pip install -e . && pip install bitsandbytes

# Gradio UI — auto-downloads this NF4 checkpoint + VAE + text encoder
python app.py --nf4-dit --4bit-vlm

# CLI
python inference.py --prompt "your prompt" --nf4-dit --4bit-vlm
```
Models are auto-downloaded from HuggingFace on first run.
## VRAM Usage (approximate)
| Component | VRAM |
|---|---|
| NF4 DiT (this file) | ~7.8 GB (resident on GPU) |
| 4-bit VLM text encoder | ~4.4 GB (offloaded after encoding) |
| VAE decode | ~0.5 GB |
| Peak during denoising | ~12 GB |
Fits comfortably on 24 GB GPUs.
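Offloading the text encoder after prompt encoding is what keeps the peak near 12 GB; without it the components would stack. A toy budget using the figures above (the denoising-activation number is inferred from the stated ~12 GB peak, not measured separately):

```python
# Component budgets (GB) from the table above.
dit, vlm, vae = 7.8, 4.4, 0.5
denoise_activations = 12.0 - dit   # inferred from the stated ~12 GB peak

# Everything resident at once (no offload) vs the offloaded schedule:
no_offload_peak = dit + vlm + denoise_activations
offload_peak = dit + denoise_activations   # VLM moved off before denoising

print(no_offload_peak, offload_peak)       # both well under 24 GB
```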
## Related Repos
- bf16 weights (source): SanDiegoDude/JoyAI-Image-Edit-Safetensors
- FP8 weights: SanDiegoDude/JoyAI-Image-Edit-FP8
- Inference code: SanDiegoDude/JoyAI-Image
- Original model: jdopensource/JoyAI-Image-Edit