Spaces:
Running on Zero
A newer version of the Gradio SDK is available: 6.18.0
title: DiffusionGemma 26B A4B
emoji: 💎
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.17.3
python_version: 3.12.12
app_file: app.py
short_description: Text and image chat with DiffusionGemma
startup_duration_timeout: 1h
models:
- google/diffusiongemma-26B-A4B-it
tags:
- image-text-to-text
- conversational
- diffusion-gemma
- gemma
- zerogpu
DiffusionGemma 26B A4B
This Space runs google/diffusiongemma-26B-A4B-it, Google's 25.2B parameter / 3.8B active parameter discrete-diffusion, multimodal Gemma 4 model for text and image input with text output.
The app follows the model card defaults for DiffusionGemma generation: entropy-bound sampling, 48 maximum denoising steps, a 0.8 to 0.4 temperature schedule, 0.1 entropy bound, 0.005 confidence threshold, and adaptive stopping. Image input is placed before text in each user turn, and assistant thinking content is never fed back into multi-turn history.
Implementation research and deployment notes are captured in MODEL_RESEARCH.md.
Live Space: https://huggingface.co/spaces/Mike0021/diffusiongemma-26b-a4b-it
Runtime
Use Gradio on ZeroGPU with zero-a10g hardware. The model's BF16 weights are about 51.7 GB, so the app requests @spaces.GPU(size="xlarge") for the full 96 GB ZeroGPU allocation. If deploying on dedicated hardware instead, use an 80 GB class GPU or larger and keep GRADIO_SSR_MODE=false.
Recommended deployment flow:
hf upload <namespace>/<space-name> . . --type space \
--exclude '.git/*' --exclude '__pycache__/*' --exclude '.venv/*' --exclude '*.pyc' \
--commit-message 'Add DiffusionGemma Space'
Create the Space and set hardware/variables with huggingface_hub.HfApi or the Hub UI when the installed hf CLI does not expose those controls. This deployment uses zero-a10g, the GRADIO_SSR_MODE=false variable, and an HF_TOKEN Space secret for model downloads.
The live revision was verified with gradio_client.Client(...).view_api() followed by text and image predict() calls. A default-style text prompt returned a clean 103-token answer in 1.5s model time, and a small OCR image prompt returned BLUE7 is visible in the image.
Research Sources
- Model card and usage examples: https://huggingface.co/google/diffusiongemma-26B-A4B-it
- Google DiffusionGemma docs: https://ai.google.dev/gemma/docs/diffusiongemma
- Hugging Face Spaces configuration: https://huggingface.co/docs/hub/spaces-config-reference
- Hugging Face ZeroGPU docs: https://huggingface.co/docs/hub/spaces-zerogpu
- Space iteration quickstart: https://gist.githubusercontent.com/gary149/37c955b832558837c40e1c14ff6d955d/raw/ad35807f8466378afd04d7653d53683a847b96c4/hf-spaces-agent-quickstart-compact.md