---
title: Autoregressive Image Token Playground
emoji: 🧩
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.22.0
python_version: 3.11
app_file: app.py
pinned: false
license: mit
---

# Autoregressive Image Token Playground

A CPU-friendly Gradio app for teaching image tokenization and image generation as sequence problems.

The first tab takes a real image and tokenizes it in two ways:

- a pretrained learned VQ tokenizer from `CompVis/ldm-celebahq-256/vqvae`
- a transparent k-means patch tokenizer for comparison

The learned tokenizer shows the actual learned codebook IDs, the reconstruction from the VQ decoder, token usage, and representative image regions for the most-used codes. The k-means option can learn a tiny codebook from one image or from all loaded MoMA images.

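To make the k-means patch tokenizer concrete, here is a minimal NumPy sketch of the idea. It is not the app's own code: the function names (`image_to_patches`, `kmeans_codebook`, `tokenize`, `detokenize`), the patch size, the codebook size, and the random stand-in image are all assumptions chosen for illustration.

```python
import numpy as np

def image_to_patches(image, patch):
    """Split an (H, W, C) array into flattened, non-overlapping patches."""
    h, w, c = image.shape
    gh, gw = h // patch, w // patch
    image = image[: gh * patch, : gw * patch]  # crop to a whole number of patches
    tiles = image.reshape(gh, patch, gw, patch, c).swapaxes(1, 2)
    return tiles.reshape(gh * gw, patch * patch * c), (gh, gw)

def nearest_code(patches, codebook):
    """Index of the closest codebook entry (squared Euclidean) per patch."""
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_codebook(patches, k, iters=10, seed=0):
    """Learn k codebook vectors with plain Lloyd iterations."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(patches), size=k, replace=False)
    codebook = patches[start].astype(float)
    for _ in range(iters):
        ids = nearest_code(patches, codebook)
        for j in range(k):
            members = patches[ids == j]
            if len(members):  # leave empty clusters where they are
                codebook[j] = members.mean(axis=0)
    return codebook

def tokenize(patches, codebook):
    """Map each patch to the ID of its nearest codebook vector."""
    return nearest_code(patches, codebook)

def detokenize(ids, codebook, grid, patch, channels):
    """Rebuild an image by pasting each token's codebook patch into place."""
    gh, gw = grid
    tiles = codebook[ids].reshape(gh, gw, patch, patch, channels)
    return tiles.swapaxes(1, 2).reshape(gh * patch, gw * patch, channels)

# Hypothetical demo input: a random 32x32 RGB array stands in for a real image.
image = np.random.default_rng(0).random((32, 32, 3))
patches, grid = image_to_patches(image, patch=8)
codebook = kmeans_codebook(patches, k=4)
token_ids = tokenize(patches, codebook)
reconstruction = detokenize(token_ids, codebook, grid, patch=8, channels=3)
```

Here `token_ids` is the sequence of discrete tokens and `reconstruction` is what those tokens decode to. With such a tiny codebook the reconstruction is deliberately coarse, which makes the information lost by quantization easy to see.
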
The app also includes a deliberately small, transparent image-token sampler. It does not call a proprietary image model. Instead, it shows the mechanics that matter for a workshop:

- an image is represented as a grid of discrete codebook tokens
- generation follows a fixed order, one token at a time
- each next token is sampled from visible logits
- logits are split into prompt, position, and previous-token context terms
- students can inspect every step, token probability table, and final token inventory

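The mechanics listed above can be sketched in a few lines of NumPy. This is an illustrative toy, not the app's implementation: the names `sample_grid`, `prompt_bias`, `position_bias`, and `transition_bias` are hypothetical, the three logit terms are filled with random numbers, and the vocabulary and grid sizes are arbitrary.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, with a temperature knob."""
    z = (logits - logits.max()) / temperature
    p = np.exp(z)
    return p / p.sum()

def sample_grid(prompt_bias, position_bias, transition_bias,
                temperature=1.0, seed=0):
    """Sample one token per grid position in a fixed (raster) order.

    logits = prompt term + position term + previous-token context term
    """
    rng = np.random.default_rng(seed)
    n_positions, vocab = position_bias.shape
    tokens, steps = [], []
    for pos in range(n_positions):
        # The first position has no previous token, so its context term is zero.
        context = transition_bias[tokens[-1]] if tokens else np.zeros(vocab)
        logits = prompt_bias + position_bias[pos] + context
        probs = softmax(logits, temperature)
        token = int(rng.choice(vocab, p=probs))
        tokens.append(token)
        # Keep everything a student might want to inspect for this step.
        steps.append({"position": pos, "probs": probs, "token": token})
    return tokens, steps

# Hypothetical setup: 8 codebook tokens, a 4x4 grid flattened to 16 positions.
vocab, n_positions = 8, 16
rng = np.random.default_rng(1)
prompt_bias = rng.normal(size=vocab)                   # same at every step
position_bias = rng.normal(size=(n_positions, vocab))  # depends on grid cell
transition_bias = rng.normal(size=(vocab, vocab))      # row = previous token

tokens, steps = sample_grid(prompt_bias, position_bias, transition_bias)
inventory = np.bincount(tokens, minlength=vocab)       # final token counts
```

Each entry in `steps` holds the full probability table for one position, and `inventory` counts how often each token was chosen. Lowering `temperature` concentrates probability on the highest-logit token, which is a quick way to show the difference between near-greedy and diverse sampling.
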
The visible token tiles are abstract swatches, not source images being pasted into the output. They stand in for learned image-token codes in real autoregressive image systems.

The Codebook tab gives students a compact sketch of how image patches become token IDs, how the autoregressive model predicts those IDs, and how IDs decode back into visible patches.

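The last step, decoding IDs back into something visible, can be as simple as a table lookup. The sketch below assumes one flat color per token and a hypothetical helper `render_token_grid`; the palette values are arbitrary, and the app's actual swatches may be drawn differently.

```python
import numpy as np

def render_token_grid(token_ids, palette, tile=16):
    """Look up one color per token ID and enlarge each cell to a tile."""
    colors = palette[token_ids]  # (rows, cols, 3)
    return colors.repeat(tile, axis=0).repeat(tile, axis=1)

# Hypothetical 4-entry codebook of RGB swatch colors.
palette = np.array(
    [[31, 119, 180], [44, 160, 44], [255, 127, 14], [148, 103, 189]],
    dtype=np.uint8,
)
token_ids = np.array([[0, 1], [2, 3]])  # a 2x2 grid of token IDs
swatches = render_token_grid(token_ids, palette, tile=16)
```

The resulting `swatches` array is a 32x32 RGB image that can be passed to `PIL.Image.fromarray` for display.
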
This pairs well with a diffusion demo because students can compare two different views of generation:

- diffusion gradually denoises a whole latent image
- autoregressive generation fills in discrete image tokens one by one

## Running

Install the requirements and run:

```bash
python app.py
```

The app is intentionally lightweight: it uses `gradio`, `numpy`, and `pillow`.