AlphaGen v1 preview Max: The state-of-the-art image generation model for all tasks!

AlphaGen v1 preview Max is the second generation of the model, building on AlphaGen v1 preview. It is still under research and development, but because it is mostly complete, it is being published on Hugging Face. (Opinion: Publish to CivitAI?)

Purpose

This model is designed for quality. It's very heavy, so sorry if your disk gives up under this giant. The GPU is another important factor… Who wants a big model for art?

How it works

Here is how the model operates. Perhaps it's a bit like GANs? (The RL mechanism is explained below)

Raw prompt -> Enhance prompt with Qwen -> Generate “draft” art with Stable Diffusion 3.5 Large -> RL (Regenerating Loop) -> Output (Yay!)
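The flow above can be read as three stages chained together. The sketch below is only an illustration: `run_pipeline` and its `enhance`, `draft`, and `refine` parameters are hypothetical names, not the API of AlphaGen_Pipeline.py, and the stages are passed in as plain callables so that no real model is loaded.

```python
from typing import Callable


def run_pipeline(
    raw_prompt: str,
    enhance: Callable[[str], str],
    draft: Callable[[str], object],
    refine: Callable[[str, object], object],
) -> object:
    """Chain the three stages; every stage here is a caller-supplied stand-in."""
    enhanced = enhance(raw_prompt)   # prompt enhancement (Qwen in the real model)
    first = draft(enhanced)          # draft image (Stable Diffusion 3.5 Large in the real model)
    return refine(enhanced, first)   # Regenerating Loop, returns the final image


if __name__ == "__main__":
    # Toy stand-ins that work on strings instead of images.
    result = run_pipeline(
        "A cat sitting on a sofa",
        enhance=lambda p: p + ", highly detailed",
        draft=lambda p: "draft<" + p + ">",
        refine=lambda p, image: image,
    )
    print(result)
```

Passing the stages in as arguments keeps the order of operations visible and lets each stage be swapped or mocked independently.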

Regenerating Loop (RL) Logic

The model employs an iterative scoring mechanism to ensure quality:

Draft Generation: Create an initial image based on the enhanced prompt.

Scoring: Evaluate the draft using Qwen.

If Score ≥ 0.9: Accept and output the image immediately.

If Score < 0.9: Regenerate the image.

Retry Limit: The model attempts regeneration up to 5 times.

If no image reaches the 0.9 threshold after 5 attempts, the best result from the attempts is output.
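The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in AlphaGen_Pipeline.py: `regenerating_loop`, `generate`, and `score` are hypothetical names, the generator and scorer are supplied by the caller, and two details are assumptions: the initial draft does not count toward the 5 retries, and a score exactly at the threshold counts as accepted.

```python
from typing import Callable, Tuple, TypeVar

Image = TypeVar("Image")


def regenerating_loop(
    generate: Callable[[str], Image],
    score: Callable[[str, Image], float],
    prompt: str,
    threshold: float = 0.9,
    max_retries: int = 5,
) -> Tuple[Image, float]:
    """Return the first image that reaches the threshold, else the best one seen."""
    best_image = generate(prompt)            # initial draft
    best_score = score(prompt, best_image)
    retries = 0
    # Keep regenerating while the best score is below the threshold
    # and the retry budget is not used up.
    while best_score < threshold and retries < max_retries:
        candidate = generate(prompt)
        candidate_score = score(prompt, candidate)
        retries += 1
        if candidate_score > best_score:
            best_image, best_score = candidate, candidate_score
    return best_image, best_score
```

Tracking the best candidate as the loop runs means the fallback case (no image reaches 0.9) needs no extra pass over the attempts.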

Model structure:


-- vision_language_model
|
|
-- diffusion_model
|
|
-> model_config.json
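A quick way to confirm that a download is complete is to check for these three entries. The sketch below assumes that vision_language_model and diffusion_model are directories and that model_config.json is a file directly under the model root; the tree above does not state this explicitly, and `missing_entries` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List

# Entry names are taken from the tree above; their kinds (dir vs file) are assumed.
EXPECTED_DIRS = ("vision_language_model", "diffusion_model")
EXPECTED_FILES = ("model_config.json",)


def missing_entries(root) -> List[str]:
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # "." is a placeholder; point it at the downloaded model folder.
    print(missing_entries("."))
```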

System Requirements:

GPU: RTX 3090 or stronger

VRAM: 24GB or more

(What else???)
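To check the VRAM requirement before downloading, something like the following may help. It is a sketch under assumptions: it uses PyTorch only if it is installed, inspects CUDA device 0 only, and allows 5% slack because the reported total can be slightly below the nominal capacity of the card. `has_enough_vram` and `detect_vram_bytes` are hypothetical helper names.

```python
from typing import Optional

GIB = 2**30


def has_enough_vram(total_bytes: int, required_gib: float = 24.0, slack: float = 0.05) -> bool:
    """True if the reported memory is within `slack` of the required amount."""
    return total_bytes >= required_gib * GIB * (1.0 - slack)


def detect_vram_bytes() -> Optional[int]:
    """Total memory of CUDA device 0 in bytes, or None if it cannot be queried."""
    try:
        import torch  # optional dependency, imported only when available
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    vram = detect_vram_bytes()
    if vram is None:
        print("Could not detect a CUDA GPU.")
    else:
        print("Enough VRAM:", has_enough_vram(vram))
```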

Example usage:

(Make sure you don't delete the AlphaGen_Pipeline.py file!)


from AlphaGen_Pipeline import AlphaGenPipeline

# Size can be "axb" or "auto". If the output value is not provided, the image is saved automatically in the same directory as AlphaGen_Pipeline.py.

AlphaGenPipeline(prompt="A cat sitting on a sofa", size="auto", output="./output")

Happy testing!
