---
license: mit
language:
- en
arxiv: 2603.12506
tags:
- text-to-image
---

# Naïve PAINE: Lightweight Text-to-Image Generation Improvement

**Naïve PAINE** (Prompt-Aware Inference Noise Evaluation) is a lightweight framework that transforms random Gaussian noise into "golden noise." Its noise prompt network, NPNet, adds a small, desirable perturbation derived from the text prompt to the initial noise, which improves the overall quality and semantic faithfulness of the synthesized images.

<div align="center">

[![arXiv](https://img.shields.io/badge/arXiv-2603.12506-b31b1b.svg)](https://arxiv.org/abs/2603.12506)
[![GitHub](https://img.shields.io/badge/GitHub-Repo-blue)](https://github.com/LSU-ATHENA/Naive-PAINE)
[![Dataset](https://img.shields.io/badge/%F0%9F%93%8A%20Dataset-PAINE--Dataset-yellow)](https://huggingface.co/datasets/LSU-ATHENA/PAINE-Dataset)

</div>

## Overview
This guide explains how to use **NPNet**, a noise prompt network that transforms random Gaussian noise into golden noise. NPNet is lightweight enough to fit into existing diffusion model (DM) pipelines.

**Supported Models:**
* Stable Diffusion XL
* DreamShaper-XL-v2-Turbo
* Hunyuan-DiT
* PixArt-Sigma

## Requirements
* Python >= 3.10.0
* PyTorch (CUDA version)
* `diffusers`, `transformers`, `accelerate`, `timm`, `einops`, `safetensors`
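
Before installing, it can help to confirm that the environment meets the requirements above. The following is a minimal sketch, not part of the repository: the helper names `python_is_supported`, `find_missing`, and `report` are hypothetical, and the assumption is that each listed package is importable under its own name (with PyTorch imported as `torch`). It uses only the Python standard library and does not import the packages themselves.

```python
import importlib.util
import sys

# Packages from the Requirements list; PyTorch is imported as "torch".
REQUIRED_PACKAGES = [
    "torch", "diffusers", "transformers",
    "accelerate", "timm", "einops", "safetensors",
]


def python_is_supported(version_info=sys.version_info, minimum=(3, 10, 0)):
    """Return True if the interpreter version is at least `minimum`."""
    return tuple(version_info[:3]) >= minimum


def find_missing(packages):
    """Return the top-level package names that cannot be located."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def report():
    """Summarize the environment check as a small dictionary."""
    return {
        "python_ok": python_is_supported(),
        "missing_packages": find_missing(REQUIRED_PACKAGES),
    }


if __name__ == "__main__":
    print(report())
```

An empty `missing_packages` list only shows that the packages can be found. Whether the installed PyTorch build has CUDA support can be checked separately with `torch.cuda.is_available()`.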

## Installation 🚀
```bash
git clone https://github.com/LSU-ATHENA/Naive-PAINE.git
cd Naive-PAINE
pip install diffusers transformers accelerate torch torchvision timm einops safetensors