Gemini Prompt Expander

A custom ModularPipelineBlocks block that uses Google's Gemini API to expand short prompts into detailed, vivid prompts for image generation.

Requirements

Install the Google Generative AI package:

pip install google-generativeai

Setup

Get your Gemini API key from Google AI Studio and set it as an environment variable:

export GOOGLE_API_KEY="your-api-key-here"
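
A missing key usually only shows up once the first request fails, so it can help to check for it before loading the block. The following is a minimal sketch, not part of the block itself; the helper name require_api_key is hypothetical, and only the GOOGLE_API_KEY variable name comes from the setup step above.

```python
import os


def require_api_key(env=os.environ, name="GOOGLE_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error.

    `require_api_key` is a hypothetical helper for illustration; it is not
    provided by diffusers or by the Gemini package.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before loading the block."
        )
    return key


# Typical use, before calling ModularPipelineBlocks.from_pretrained(...):
#     require_api_key()
```

Passing the environment as an argument keeps the check easy to exercise with a plain dict instead of the real process environment.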

Usage

from diffusers.modular_pipelines import ModularPipelineBlocks

gemini_block = ModularPipelineBlocks.from_pretrained(
    "diffusers-internal-dev/gemini-prompt-expander",
    trust_remote_code=True,
)
gemini = gemini_block.init_pipeline()
output = gemini(prompt="a dog sitting by the river, watching the sunset")
print(f"{output.values['prompt']=}")
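
This README does not show how the block talks to Gemini. As a rough illustration of the kind of request a prompt expander might make with the google-generativeai package, here is a sketch under stated assumptions: the instruction template, the function names build_request and expand_prompt, and the model name gemini-1.5-flash are all assumptions for illustration, not the block's actual implementation.

```python
import os

# Hypothetical instruction template; the wording the block really uses is
# not documented in this README.
INSTRUCTION = (
    "Expand the following short prompt into a detailed, vivid prompt "
    "for an image generation model. Return only the expanded prompt.\n\n"
    "Prompt: {prompt}"
)


def build_request(prompt: str) -> str:
    """Wrap a short user prompt in the expansion instruction."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    return INSTRUCTION.format(prompt=prompt)


def expand_prompt(prompt: str, model_name: str = "gemini-1.5-flash") -> str:
    """Send the wrapped prompt to Gemini and return the expanded text.

    The model name is an assumption; substitute one your API key can access.
    """
    # Imported lazily so build_request works without the package installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_request(prompt))
    return response.text.strip()
```

Calling expand_prompt("a dog sitting by the river, watching the sunset") requires network access and a valid key. Inside a pipeline, the block shown above remains the intended entry point.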

Use in Mellon

This block includes a mellon_pipeline_config.json for use with Mellon:

1. Drag a Dynamic Block Node from the ModularDiffusers section onto the canvas.
2. Enter diffusers-internal-dev/gemini-prompt-expander as the repo_id.
3. The node then updates to show a prompt input and an expanded prompt output.