
title: 3D Embed — EmbeddingGemma Visualizer
emoji: 🌌
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
short_description: Visualize EmbeddingGemma 300M embeddings in interactive 3D
pinned: true

🌌 3D Embed — EmbeddingGemma Visualizer

An interactive demo showcasing Google's EmbeddingGemma 300M — a state-of-the-art, lightweight embedding model built on the Gemma 3 architecture.

This Space lets you see how the model organizes language by projecting its 768-dimensional embeddings into an explorable 3D space using PCA (principal component analysis) for dimensionality reduction.

✨ Features

🔤 Word Galaxy — Type individual words and watch semantic clusters form in 3D. See how "cat", "dog", and "fish" cluster differently from "car", "bus", and "train".
📝 Sentence Explorer — Compare full sentences and discover how meaning shapes geometry. Similar sentences land near each other; different ones drift apart.
🔍 Semantic Search — Enter a query and a set of documents. Watch the model find the closest match and see why through spatial proximity.
🪆 Matryoshka Dimensions — Explore how MRL (Matryoshka Representation Learning) lets you truncate embeddings from 768d → 512d → 256d → 128d with minimal quality loss, visualized side-by-side.
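The Semantic Search and Matryoshka Dimensions features can be illustrated with a minimal sketch. It assumes embeddings arrive as plain lists of floats; the 8-dimensional vectors are toy stand-ins for real 768-dimensional outputs, and the helper names (`truncate_embedding`, `rank_documents`) are hypothetical, not taken from the app's code. Truncation keeps the leading components and then re-normalizes, since a truncated unit vector is no longer unit length.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then re-normalize to unit length."""
    if dim > len(vec):
        raise ValueError("dim exceeds embedding size")
    return l2_normalize(vec[:dim])

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy 8-dimensional stand-ins for real 768-dimensional embeddings.
query = [0.9, 0.1, 0.0, 0.2, 0.05, 0.0, 0.1, 0.0]
docs = [
    [0.8, 0.2, 0.1, 0.1, 0.0, 0.1, 0.0, 0.05],  # points in a similar direction
    [0.0, 0.1, 0.9, 0.0, 0.3, 0.0, 0.2, 0.1],   # points in a different direction
]

# Rank with the full vectors, then again after truncating to half the size.
full_ranking = rank_documents(query, docs)
short_ranking = rank_documents(
    truncate_embedding(query, 4),
    [truncate_embedding(d, 4) for d in docs],
)
```

With real embeddings, the same comparison at 768, 512, 256, and 128 dimensions gives a rough sense of how much ranking quality survives truncation.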

🧠 About EmbeddingGemma

| Property | Value |
|---|---|
| Parameters | 308M |
| Embedding Dim | 768 (truncatable to 512, 256, 128) |
| Context Window | 2,048 tokens |
| Languages | 100+ |
| Architecture | Gemma 3 encoder (bidirectional attention) |
| Backbone | `sentence-transformers` compatible |

At the time of its release, EmbeddingGemma was the highest-ranking text-only multilingual embedding model under 500M parameters on the MTEB leaderboard. It uses bidirectional (encoder-style) attention rather than causal (decoder-style) attention, so every token's representation can draw on the full input, which suits it to producing embeddings.
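The difference between the two attention patterns can be shown with a small, purely illustrative sketch. It only builds the boolean masks that say which positions a token may attend to; it is not EmbeddingGemma's implementation, and the function names are hypothetical.

```python
def causal_mask(n):
    """mask[i][j] is True when token i may attend to token j (only j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Every token may attend to every token, earlier or later."""
    return [[True] * n for _ in range(n)]

def render(mask):
    """Draw a mask as rows of 'x' (allowed) and '.' (blocked)."""
    return "\n".join(
        " ".join("x" if allowed else "." for allowed in row) for row in mask
    )

print(render(causal_mask(4)))
print()
print(render(bidirectional_mask(4)))
```

Under the causal mask the first token sees only itself, whereas under the bidirectional mask its representation can depend on the whole sentence.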

🚀 How It Works

1. Text is passed through EmbeddingGemma with task-specific prompts (e.g., "task: sentence similarity | query: ")
2. The model produces 768-dimensional normalized embeddings
3. PCA reduces these to 3 dimensions, capturing the directions of maximum variance
4. Plotly renders the points as an interactive 3D scatter plot
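The PCA step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's actual code: it computes PCA with NumPy's SVD rather than whatever library the Space uses, the random unit vectors stand in for real model outputs, and `pca_project` is a hypothetical helper name.

```python
import numpy as np

def pca_project(embeddings, n_components=3):
    """Project the rows of `embeddings` onto their top principal components."""
    X = np.asarray(embeddings, dtype=np.float64)
    X_centered = X - X.mean(axis=0)  # PCA operates on mean-centered data
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    coords = X_centered @ Vt[:n_components].T
    # Fraction of total variance captured by each component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return coords, explained[:n_components]

# Stand-in data: 12 random unit vectors in place of real 768-d embeddings.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(12, 768))
fake_embeddings /= np.linalg.norm(fake_embeddings, axis=1, keepdims=True)

points_3d, explained_ratio = pca_project(fake_embeddings)
print(points_3d.shape)  # one (x, y, z) row per input text
```

The three columns of `points_3d` are what a 3D scatter plot would take as its x, y, and z coordinates. The explained-variance ratios indicate how much structure the 3D view preserves; when they are low, distances in the plot are only a rough guide to distances in the full embedding space.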
