Spaces:

Mike0021
/

diffusiongemma-26b-a4b-it

Running on Zero

App Files Files Community

diffusiongemma-26b-a4b-it / README.md

Mike0021

Document verified deployment

77baae9 verified 2 days ago

preview code

raw

history blame contribute delete

2.83 kB

A newer version of the Gradio SDK is available: 6.18.0

Upgrade

metadata

title: DiffusionGemma 26B A4B
emoji: 💎
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.17.3
python_version: 3.12.12
app_file: app.py
short_description: Text and image chat with DiffusionGemma
startup_duration_timeout: 1h
models:
  - google/diffusiongemma-26B-A4B-it
tags:
  - image-text-to-text
  - conversational
  - diffusion-gemma
  - gemma
  - zerogpu

DiffusionGemma 26B A4B

This Space runs google/diffusiongemma-26B-A4B-it, Google's 25.2B parameter / 3.8B active parameter discrete-diffusion, multimodal Gemma 4 model for text and image input with text output.

The app follows the model card defaults for DiffusionGemma generation: entropy-bound sampling, 48 maximum denoising steps, a 0.8 to 0.4 temperature schedule, 0.1 entropy bound, 0.005 confidence threshold, and adaptive stopping. Image input is placed before text in each user turn, and assistant thinking content is never fed back into multi-turn history.

Implementation research and deployment notes are captured in MODEL_RESEARCH.md.

Live Space: https://huggingface.co/spaces/Mike0021/diffusiongemma-26b-a4b-it

Runtime

Use Gradio on ZeroGPU with zero-a10g hardware. The model's BF16 weights are about 51.7 GB, so the app requests @spaces.GPU(size="xlarge") for the full 96 GB ZeroGPU allocation. If deploying on dedicated hardware instead, use an 80 GB class GPU or larger and keep GRADIO_SSR_MODE=false.

Recommended deployment flow:

hf upload <namespace>/<space-name> . . --type space \
  --exclude '.git/*' --exclude '__pycache__/*' --exclude '.venv/*' --exclude '*.pyc' \
  --commit-message 'Add DiffusionGemma Space'

Create the Space and set hardware/variables with huggingface_hub.HfApi or the Hub UI when the installed hf CLI does not expose those controls. This deployment uses zero-a10g, the GRADIO_SSR_MODE=false variable, and an HF_TOKEN Space secret for model downloads.

The live revision was verified with gradio_client.Client(...).view_api() followed by text and image predict() calls. A default-style text prompt returned a clean 103-token answer in 1.5s model time, and a small OCR image prompt returned BLUE7 is visible in the image.

Research Sources

Model card and usage examples: https://huggingface.co/google/diffusiongemma-26B-A4B-it
Google DiffusionGemma docs: https://ai.google.dev/gemma/docs/diffusiongemma
Hugging Face Spaces configuration: https://huggingface.co/docs/hub/spaces-config-reference
Hugging Face ZeroGPU docs: https://huggingface.co/docs/hub/spaces-zerogpu
Space iteration quickstart: https://gist.githubusercontent.com/gary149/37c955b832558837c40e1c14ff6d955d/raw/ad35807f8466378afd04d7653d53683a847b96c4/hf-spaces-agent-quickstart-compact.md