Apply for a GPU community grant: Academic project
#1
by Jellyfish042 - opened
This Space provides a granular, byte-level visualization tool to compare two distinct architectures: the state-of-the-art Transformer (Qwen3-1.7B) and the latest Linear RNN (RWKV7-1.5B).
Unlike standard benchmarks, this tool visualizes the "compression as intelligence" hypothesis in action, letting the community inspect how each architecture handles tokenization and prediction uncertainty at the byte level.
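A common way to make such byte-level comparisons architecture- and tokenizer-agnostic is to convert per-token log-probabilities into a bits-per-byte score. The helper below is a hypothetical sketch of that metric (the Space's actual computation is not shown here; `bits_per_byte` and its inputs are illustrative names):

```python
import math

def bits_per_byte(token_logprobs, token_strings):
    """Convert per-token natural-log probabilities into an overall
    bits-per-byte score: total surprisal (in bits) divided by the
    number of UTF-8 bytes in the decoded text."""
    total_bits = sum(-lp / math.log(2) for lp in token_logprobs)
    total_bytes = sum(len(t.encode("utf-8")) for t in token_strings)
    return total_bits / total_bytes

# Hypothetical numbers: a model that assigns probability 0.5 to each
# of four single-byte tokens spelling "abcd" compresses the text at
# exactly 1 bit per byte.
print(bits_per_byte([math.log(0.5)] * 4, ["a", "b", "c", "d"]))  # → 1.0
```

Because the denominator counts bytes rather than tokens, models with different tokenizers (such as Qwen3 and RWKV7) can be compared on the same scale.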
Why a GPU is needed: Real-time inference on two models with 1.5B+ parameters each requires significant compute. A GPU is essential for bfloat16 precision and Flash Attention, which reduce inference latency from minutes (on CPU) to seconds and keep the experience smooth and interactive for researchers exploring these architectures.
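For context, the GPU-enabled setup described above might look roughly like the following `transformers` loading fragment. This is a sketch, not the Space's actual code: the model ID and flags are assumptions, and `flash_attention_2` requires the `flash-attn` package and a supported GPU.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID for the Transformer side of the comparison.
model_id = "Qwen/Qwen3-1.7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,              # halves memory vs. float32
    attn_implementation="flash_attention_2", # needs flash-attn installed
    device_map="cuda",                       # place weights on the GPU
)
```

On CPU neither bfloat16 acceleration nor Flash Attention is available, which is what pushes per-request latency from seconds into minutes.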