# JoyAI-Image-Edit — NF4 4-bit Pre-Quantized Transformer
Pre-quantized NF4 (4-bit) version of the JoyAI-Image-Edit DiT transformer, created with bitsandbytes.
## Key Details
| Property | Value |
|---|---|
| File size | 7.83 GB (vs 31 GB bf16, 16 GB FP8) |
| Quantization | NF4 (bitsandbytes, double quantization) |
| Layers quantized | 326 `nn.Linear` → `Linear4bit` |
| Source weights | SanDiegoDude/JoyAI-Image-Edit-Safetensors (bf16) |
| Tested on | NVIDIA RTX 4090 (24 GB), NVIDIA GB10 |
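As a sanity check on the file size, here is a rough back-of-the-envelope estimate. The ~15.5B parameter count is inferred from the 31 GB bf16 checkpoint, and the per-block overheads follow the standard bitsandbytes NF4 layout; exact numbers depend on which layers stay unquantized.

```python
# Rough NF4 size estimate for a transformer whose bf16 checkpoint is 31 GB.
# Parameter count is inferred from the bf16 size (2 bytes/param) -- an assumption.
bf16_bytes = 31e9
params = bf16_bytes / 2                      # ~15.5B parameters

weight_bytes = params * 0.5                  # 4 bits per weight
# Double quantization: one 8-bit quantized absmax per 64-weight block,
# plus one fp32 secondary scale per 256 absmax values.
absmax_bytes = params / 64 + params / (64 * 256) * 4

total_gb = (weight_bytes + absmax_bytes) / 1e9
print(f"estimated NF4 size: {total_gb:.2f} GB")
```

This lands in the ballpark of the 7.83 GB on disk; the real file is slightly smaller because norm layers and other non-`nn.Linear` weights are not part of the 4-bit budget.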
This file loads directly in seconds — no runtime quantization needed.
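For intuition, NF4 maps each weight block to a fixed 16-value "normal float" codebook scaled by the block's absolute maximum. A minimal pure-Python sketch of that round trip (the codebook values are reproduced from the QLoRA paper / bitsandbytes source and are illustrative, not an exact reimplementation of `Linear4bit`):

```python
# NF4 codebook: 16 quantiles of a standard normal, normalized to [-1, 1]
# (values as used by bitsandbytes; treated as given here).
NF4_CODE = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]

def quantize_block(block):
    """Quantize one weight block to 4-bit codebook indices plus an absmax scale."""
    absmax = max(abs(w) for w in block) or 1.0
    idxs = [min(range(16), key=lambda i: abs(w / absmax - NF4_CODE[i]))
            for w in block]
    return idxs, absmax

def dequantize_block(idxs, absmax):
    return [NF4_CODE[i] * absmax for i in idxs]

weights = [0.31, -0.05, 0.92, -0.44, 0.0, 0.17, -0.88, 0.63]
idxs, scale = quantize_block(weights)
restored = dequantize_block(idxs, scale)
# Each restored weight sits within half the widest codebook gap, times absmax.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Because the codebook and scales are all the checkpoint needs, loading is just reading indices and absmax values from disk, which is why no runtime quantization pass is required.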
## Inference Tool
A Gradio UI, CLI, and REST API are available at SanDiegoDude/JoyAI-Image.
## Quick Start
```bash
git clone https://github.com/SanDiegoDude/JoyAI-Image.git
cd JoyAI-Image
python -m venv .venv && source .venv/bin/activate
pip install -e . && pip install bitsandbytes

# Gradio UI — auto-downloads this NF4 checkpoint + VAE + text encoder
python app.py --nf4-dit --4bit-vlm

# CLI
python inference.py --prompt "your prompt" --nf4-dit --4bit-vlm
```
Models are auto-downloaded from HuggingFace on first run.
## VRAM Usage (approximate)
| Component | VRAM |
|---|---|
| NF4 DiT (this file) | ~7.8 GB (resident on GPU) |
| 4-bit VLM text encoder | ~4.4 GB (offloaded after encoding) |
| VAE decode | ~0.5 GB |
| Peak during denoising | ~12 GB |
Fits comfortably on 24 GB GPUs.
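Offloading the text encoder after prompt encoding is what keeps the peak near 12 GB; without it the components would stack. A toy budget using the figures above (the denoising-activation number is inferred from the stated ~12 GB peak, not measured separately):

```python
# Component budgets (GB) from the table above.
dit, vlm, vae = 7.8, 4.4, 0.5
denoise_activations = 12.0 - dit   # inferred from the stated ~12 GB peak

# Everything resident at once (no offload) vs the offloaded schedule:
no_offload_peak = dit + vlm + denoise_activations
offload_peak = dit + denoise_activations   # VLM moved off before denoising

print(no_offload_peak, offload_peak)       # both well under 24 GB
```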
## Related Repos
- bf16 weights (source): SanDiegoDude/JoyAI-Image-Edit-Safetensors
- FP8 weights: SanDiegoDude/JoyAI-Image-Edit-FP8
- Inference code: SanDiegoDude/JoyAI-Image
- Original model: jdopensource/JoyAI-Image-Edit