---
language:
- en
tags:
- text-to-image
- diffusion
- flux
- grpo
- alignment
pipeline_tag: text-to-image
base_model: black-forest-labs/FLUX.1-dev
---

# RealGRPO FLUX DiT Weights

This repository provides **DiT weights** fine-tuned from **FLUX.1-dev** with **GRPO** using the **RealGRPO** strategy.

RealGRPO targets a common post-training issue in image generation: **reward hacking** (e.g., over-smoothing, over-saturation, and synthetic-looking artifacts).  
Compared with vanilla FLUX and standard GRPO baselines, these weights are optimized to better preserve prompt intent while reducing reward-driven artifacts.

## What Is Included

- Fine-tuned FLUX DiT weights (GRPO post-training).
- Training objective based on contrastive positive/negative style guidance.
- Compatibility with the RealGRPO codebase inference scripts.

## Method (Brief)

RealGRPO uses an LLM to generate prompt-specific style pairs:
- positive style cues (`pos_style`)
- negative style cues (`neg_style`)

The reward encourages similarity to positive cues while penalizing negative cues, helping the model avoid artifact-prone shortcuts during alignment.
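
The reward described above can be illustrated with a minimal sketch. It assumes the image and both style cues have already been embedded into a shared vector space by some encoder, and that the reward is the cosine similarity to `pos_style` minus a weighted cosine similarity to `neg_style`. The function names, the `neg_weight` parameter, and the exact reward form are hypothetical and are not taken from the RealGRPO codebase. The last function shows the standard GRPO step of normalizing rewards within a group of samples generated for the same prompt.

```python
import numpy as np


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def contrastive_style_reward(image_emb, pos_emb, neg_emb, neg_weight=1.0):
    """Hypothetical reward: similarity to pos_style minus weighted similarity to neg_style.

    All three inputs are assumed to be embeddings in the same space.
    """
    return cosine(image_emb, pos_emb) - neg_weight * cosine(image_emb, neg_emb)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one prompt's sample group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)


if __name__ == "__main__":
    pos = [1.0, 0.0]  # toy embedding standing in for pos_style
    neg = [0.0, 1.0]  # toy embedding standing in for neg_style
    samples = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]  # toy image embeddings
    rewards = [contrastive_style_reward(s, pos, neg) for s in samples]
    print(rewards)
    print(group_relative_advantages(rewards))
```

In this toy setup, a sample aligned with the positive cue gets the highest advantage and a sample aligned with the negative cue gets the lowest, which is the behavior the reward is meant to encourage.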

> Note: This release contains DiT alignment weights, not a standalone full pipeline package. You need to download `black-forest-labs/FLUX.1-dev` and replace the contents of its `transformer` directory with the contents of this repository.
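
The replacement step in the note can be scripted. The sketch below assumes both repositories have already been downloaded to local directories; the helper name `replace_transformer_dir` and the example paths are hypothetical. It copies every file from the weights directory, skipping only a `.git` folder if one is present.

```python
import shutil
from pathlib import Path


def replace_transformer_dir(flux_dir, weights_dir):
    """Replace <flux_dir>/transformer with the contents of weights_dir.

    flux_dir:    local copy of black-forest-labs/FLUX.1-dev (assumed path)
    weights_dir: local copy of this repository (assumed path)
    """
    flux_dir, weights_dir = Path(flux_dir), Path(weights_dir)
    if not weights_dir.is_dir():
        raise FileNotFoundError(f"weights directory not found: {weights_dir}")

    target = flux_dir / "transformer"
    if target.exists():
        shutil.rmtree(target)  # drop the original FLUX.1-dev DiT weights

    # Copy the fine-tuned weights in, ignoring any git metadata folder.
    shutil.copytree(weights_dir, target, ignore=shutil.ignore_patterns(".git"))
    return target


if __name__ == "__main__":
    # Example paths only; adjust to wherever the two repositories were downloaded.
    print(replace_transformer_dir("./FLUX.1-dev", "./realgrpo-weights"))
```

Because the helper deletes the existing `transformer` directory, keep a backup of the original FLUX.1-dev weights if you also want to run the unmodified model.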