Apply for a GPU community grant: Academic project
#1
by Jellyfish042 - opened
This Space provides a granular, byte-level visualization tool to compare two distinct architectures: the state-of-the-art Transformer (Qwen3-1.7B) and the latest Linear RNN (RWKV7-1.5B).
Unlike standard benchmarks, this tool visualizes the "compression as intelligence" hypothesis in action, letting the community inspect how each architecture handles tokenization and prediction uncertainty at the byte level.
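A common way to make such byte-level comparisons architecture- and tokenizer-agnostic is to convert per-token log-probabilities into a bits-per-byte score. The helper below is a hypothetical sketch of that metric (the Space's actual computation is not shown here; `bits_per_byte` and its inputs are illustrative names):

```python
import math

def bits_per_byte(token_logprobs, token_strings):
    """Convert per-token natural-log probabilities into an overall
    bits-per-byte score: total surprisal (in bits) divided by the
    number of UTF-8 bytes in the decoded text."""
    total_bits = sum(-lp / math.log(2) for lp in token_logprobs)
    total_bytes = sum(len(t.encode("utf-8")) for t in token_strings)
    return total_bits / total_bytes

# Hypothetical numbers: a model that assigns probability 0.5 to each
# of four single-byte tokens spelling "abcd" compresses the text at
# exactly 1 bit per byte.
print(bits_per_byte([math.log(0.5)] * 4, ["a", "b", "c", "d"]))  # → 1.0
```

Because the denominator counts bytes rather than tokens, models with different tokenizers (such as Qwen3 and RWKV7) can be compared on the same scale.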
Why a GPU is needed: Real-time inference on two models with 1.5B+ parameters each requires significant compute. A GPU is essential for bfloat16 precision and Flash Attention, which reduce inference latency from minutes (on CPU) to seconds and keep the experience smooth and interactive for researchers exploring these architectures.
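For context, the GPU-enabled setup described above might look roughly like the following `transformers` loading fragment. This is a sketch, not the Space's actual code: the model ID and flags are assumptions, and `flash_attention_2` requires the `flash-attn` package and a supported GPU.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID for the Transformer side of the comparison.
model_id = "Qwen/Qwen3-1.7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,              # halves memory vs. float32
    attn_implementation="flash_attention_2", # needs flash-attn installed
    device_map="cuda",                       # place weights on the GPU
)
```

On CPU neither bfloat16 acceleration nor Flash Attention is available, which is what pushes per-request latency from seconds into minutes.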