---
title: NeuroScope
emoji: 🧠
colorFrom: indigo
colorTo: yellow
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
license: mit
tags:
- activation-visualization
- interpretability
- attention
- hidden-states
- qwen
- transformer
- mechanistic-interpretability
- neural-network
---
# 🧠 NeuroScope: Neural Network Activation Visualizer
**See inside a large language model while it thinks.**
NeuroScope is an interactive dashboard for visualizing the internal representations of [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) during inference. Watch attention patterns form, track how token representations evolve through 36 transformer layers, and discover how the network processes language, all in real time.
Built for mechanistic interpretability researchers, ML students, and anyone curious about what happens inside a transformer.
## ✨ Visualizations
### 🔍 Attention Heatmap
A token×token matrix showing where the model "looks" at each layer and head.
**What to look for:**
- **Diagonal patterns** → tokens attending to themselves (self-reinforcement)
- **First-column stripes** → BOS/sink attention: many heads route unused attention to the first token as a "no-op" dump
- **Sub-diagonal bands** → previous-token or induction heads that implement copying and in-context learning
- **Block patterns** → semantic phrase grouping (articles attending to their nouns, verbs to their subjects)
- Switch between all 32 individual heads or view average/max aggregation to spot global patterns
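
The average/max aggregation and the patterns listed above can be approximated numerically. The following is a minimal sketch, assuming the attention for one layer is available as a NumPy array of shape `(heads, seq, seq)`; the function names `aggregate_heads` and `pattern_scores` are hypothetical and not taken from NeuroScope's code.

```python
import numpy as np


def aggregate_heads(attn: np.ndarray, mode: str = "mean") -> np.ndarray:
    """Collapse a (heads, seq, seq) attention tensor into one (seq, seq) map."""
    if mode == "mean":
        return attn.mean(axis=0)
    if mode == "max":
        return attn.max(axis=0)
    raise ValueError(f"unknown aggregation mode: {mode!r}")


def pattern_scores(attn_map: np.ndarray) -> dict:
    """Average attention mass on the diagonal, sub-diagonal, and first column."""
    seq = attn_map.shape[0]
    if seq < 2:
        raise ValueError("need at least two tokens")
    return {
        # Tokens attending to themselves.
        "diagonal": float(np.trace(attn_map)) / seq,
        # Tokens attending to the token immediately before them.
        "previous_token": float(np.trace(attn_map, offset=-1)) / (seq - 1),
        # Attention routed to the first token (row 0 is skipped because
        # under a causal mask it can only attend to itself).
        "first_column": float(attn_map[1:, 0].mean()),
    }
```

A head with a high `first_column` score is a candidate attention sink, and a high `previous_token` score suggests a previous-token head. These are rough heuristics for triage, not a classification of head function.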
### 📊 Activation Magnitude
Bar chart tracking the L2 norm of hidden states across all 36 layers.
**What to look for:**
- **Gradual magnitude growth** → normal residual stream accumulation through depth
- **Sudden spikes** → layers performing heavy computation, often aligned with MLP activity
- **⭐ Gold-highlighted layers 9, 18, 27** → the three layers used by the [Activation Avatars](https://huggingface.co/Alogotron/Milady-Avatar-Adapter) project for mapping neural activity to avatar expressions
- **Embedding→final ratio** → how much the network amplifies representations end-to-end
- Try different metrics: *mean L2* for overall energy, *max L2* for peak activations, *mean absolute* for raw magnitude
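
The three metrics can be written down compactly. This is a minimal sketch, assuming hidden states are stacked into an array of shape `(layers, tokens, hidden)`; the function name and the metric keys `mean_l2`, `max_l2`, and `mean_abs` are illustrative choices, not NeuroScope's actual identifiers.

```python
import numpy as np


def layer_magnitudes(hidden_states: np.ndarray, metric: str = "mean_l2") -> np.ndarray:
    """Reduce (layers, tokens, hidden) activations to one value per layer."""
    if metric == "mean_l2":
        # L2 norm of each token vector, averaged over tokens.
        return np.linalg.norm(hidden_states, axis=-1).mean(axis=-1)
    if metric == "max_l2":
        # Largest per-token L2 norm in each layer.
        return np.linalg.norm(hidden_states, axis=-1).max(axis=-1)
    if metric == "mean_abs":
        # Mean absolute value over all tokens and hidden units.
        return np.abs(hidden_states).mean(axis=(-2, -1))
    raise ValueError(f"unknown metric: {metric!r}")
```

With these values in hand, the embedding→final ratio is `mags[-1] / mags[0]`, provided index 0 holds the embedding output.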
### 🌡️ Token × Layer Grid
A heatmap with tokens as columns and layers as rows, revealing how each token's activation strength evolves through the full network depth.
**What to look for:**
- **Bright vertical columns** → tokens that stay strongly activated throughout (content words, pivotal tokens)
- **Dim columns** → function words or tokens the network processes minimally
- **Horizontal bright bands** → layers that uniformly boost all tokens (computation-heavy layers)
- **Phase transitions** → abrupt changes in activation patterns at certain depth thresholds, often near layers 9, 18, or 27
- Try different normalization modes: *global* for absolute comparison, *per_token* to see each token's depth journey, *per_layer* to highlight within-layer variation
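
The difference between the three normalization modes comes down to which axis the scaling runs over. The sketch below assumes a `(layers, tokens)` grid of activation norms and min-max scaling to `[0, 1]`; the scaling rule is an assumption, since the exact formula NeuroScope uses is not stated here, and `normalize_grid` is a hypothetical name.

```python
import numpy as np


def normalize_grid(grid: np.ndarray, mode: str = "global") -> np.ndarray:
    """Min-max scale a (layers, tokens) grid to [0, 1] under the given mode."""
    if mode == "global":
        axis = None  # one min/max for the whole grid
    elif mode == "per_token":
        axis = 0     # scale each column (token) across layers
    elif mode == "per_layer":
        axis = 1     # scale each row (layer) across tokens
    else:
        raise ValueError(f"unknown normalization mode: {mode!r}")
    lo = grid.min(axis=axis, keepdims=True)
    hi = grid.max(axis=axis, keepdims=True)
    # Guard against division by zero when a row or column is constant.
    return (grid - lo) / np.maximum(hi - lo, 1e-12)
```

For the grid `[[1, 2], [3, 5]]`, *global* keeps absolute differences visible, while *per_token* and *per_layer* each stretch their own column or row to the full color range.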
### 🎯 Token Representation Space (PCA/UMAP)
2D scatter plot of token hidden states projected via PCA or UMAP, showing how tokens organize in representation space.
**What to look for:**
- **Semantic clustering** → similar-meaning tokens pulled together (adjectives grouping, nouns grouping)
- **Positional gradients** → early vs. late tokens occupying different regions
- **Layer trajectories** → overlay multiple layers (e.g., `0,9,18,27,35`) to watch tokens "move" through representation space as they pass through the network. Tokens typically start clustered at the embedding layer and progressively separate
- **Outlier tokens** → special tokens (BOS, punctuation) often sit far from content tokens
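
A 2D PCA projection needs little more than a centered SVD. This is a minimal sketch of that standard technique, assuming one layer's token vectors form a `(n_tokens, hidden)` array; it illustrates the idea and is not NeuroScope's built-in implementation, and `pca_2d` is a hypothetical name.

```python
import numpy as np


def pca_2d(x: np.ndarray) -> np.ndarray:
    """Project (n_tokens, hidden) rows onto their top two principal components."""
    if x.shape[0] < 2:
        raise ValueError("need at least two tokens for a 2D projection")
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T
```

For layer trajectories, one reasonable design is to stack the token vectors from all selected layers into a single matrix before calling `pca_2d`, so every layer is plotted on the same axes and the apparent "movement" is comparable across depth.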
## 🚀 Quick Start
### Demo Mode (no GPU required)
```bash
pip install numpy plotly gradio
python app.py
# Opens at http://localhost:7860
```
Demo mode generates realistic synthetic data matching Qwen3-4B's full dimensions (36 layers, 32 heads, 2560 hidden dim) with structured patterns including head specialization and depth-dependent magnitude growth, perfect for exploring the UI without a GPU.
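
To give a sense of what such synthetic data can look like, here is a hedged sketch. It is not the generator in `extraction.py`; the function names, the choice of which heads are "specialized", and the linear `growth` factor are all assumptions made for illustration.

```python
import numpy as np


def demo_attention(n_heads: int = 32, seq_len: int = 8, seed: int = 0) -> np.ndarray:
    """Causal attention of shape (heads, seq, seq) with two specialized heads."""
    rng = np.random.default_rng(seed)
    logits = rng.normal(size=(n_heads, seq_len, seq_len))
    idx = np.arange(seq_len)
    # Hypothetical specialization: head 0 acts as a sink head,
    # head 1 (if present) as a previous-token head.
    logits[0, :, 0] += 4.0
    if n_heads > 1:
        logits[1, idx[1:], idx[:-1]] += 4.0
    # Causal mask: position i may only attend to positions j <= i.
    causal = idx[None, :] <= idx[:, None]
    logits = np.where(causal, logits, -np.inf)
    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=-1, keepdims=True)


def demo_hidden_states(n_layers: int = 36, seq_len: int = 8, hidden: int = 2560,
                       growth: float = 0.05, seed: int = 0) -> np.ndarray:
    """Random states of shape (layers + 1, seq, hidden) whose norm grows with depth."""
    rng = np.random.default_rng(seed)
    states = rng.normal(size=(n_layers + 1, seq_len, hidden))
    # Index 0 stands in for the embedding output; deeper layers are scaled up.
    scale = 1.0 + growth * np.arange(n_layers + 1)
    return states * scale[:, None, None]
```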
### Real Model Inference
```bash
pip install -r requirements.txt
python app.py --model
# Loads Qwen3-4B with 4-bit quantization (~3 GB VRAM)
```
For full-precision on high-VRAM setups (recommended for research):
```bash
python app.py --model --no-quantize
# Loads in bf16 (~8 GB VRAM)
```
### CLI Options
| Flag | Description | Default |
|------|-------------|---------|
| `--model` | Load Qwen3-4B for real inference (requires GPU) | Off (demo mode) |
| `--model-name` | HuggingFace model name or path | `Qwen/Qwen3-4B` |
| `--no-quantize` | Use bf16 instead of 4-bit quantization | Quantized |
| `--port` | Server port | `7860` |
| `--share` | Create a public Gradio share link | Off |
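
For readers who want to script around these flags, the table maps naturally onto `argparse`. The sketch below reproduces only the flags and defaults documented above; how `app.py` actually parses its arguments is an assumption, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented NeuroScope CLI flags."""
    parser = argparse.ArgumentParser(description="NeuroScope activation visualizer")
    parser.add_argument("--model", action="store_true",
                        help="Load the model for real inference (requires GPU)")
    parser.add_argument("--model-name", default="Qwen/Qwen3-4B",
                        help="HuggingFace model name or path")
    parser.add_argument("--no-quantize", action="store_true",
                        help="Use bf16 instead of 4-bit quantization")
    parser.add_argument("--port", type=int, default=7860, help="Server port")
    parser.add_argument("--share", action="store_true",
                        help="Create a public Gradio share link")
    return parser
```

Without `--model`, `args.model` is `False`, which corresponds to demo mode.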
## 🏗️ Architecture
### Model: Qwen3-4B
| Parameter | Value |
|-----------|-------|
| Hidden layers | 36 |
| Attention heads | 32 (GQA with 8 KV heads) |
| Hidden dimension | 2560 |
| Head dimension | 128 (set explicitly in the model config, so it is not hidden dimension / heads = 80) |
| Positional encoding | RoPE |
| MLP type | SwiGLU |
| Total parameters | ~4B |
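
The "32 (GQA with 8 KV heads)" row means that groups of query heads share one key/value head. The sketch below shows the usual mapping, in which consecutive query heads form a group; the constant and function names are illustrative.

```python
N_QUERY_HEADS = 32
N_KV_HEADS = 8
GROUP_SIZE = N_QUERY_HEADS // N_KV_HEADS  # 4 query heads per KV head


def kv_head_for(query_head: int) -> int:
    """Return the index of the KV head shared by the given query head."""
    if not 0 <= query_head < N_QUERY_HEADS:
        raise ValueError(f"query head out of range: {query_head}")
    return query_head // GROUP_SIZE
```

Query heads 0 to 3 therefore read the same keys and values. Each still has its own query projection, which is why the attention heatmap can show 32 distinct patterns.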
### Layers 9, 18, 27: Activation Avatars Anchor Points
NeuroScope highlights three layers that serve as anchor points for the [Activation Avatars](https://huggingface.co/Alogotron/Milady-Avatar-Adapter) project, which maps LLM hidden states to real-time avatar expressions:
- **Layer 9** (~25% depth): Captures syntactic structure and grammatical relationships
- **Layer 18** (~50% depth): Encodes semantic meaning and contextual understanding
- **Layer 27** (~75% depth): Represents higher-level pragmatic and emotional content
These layers were selected through probing experiments showing they capture complementary levels of linguistic abstraction, making them ideal anchor points for mapping neural activity to expressive visual representations.
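
The depth percentages follow from dividing the layer number by 36. When picking these layers out of a Transformers `hidden_states` tuple, note that the tuple has 37 entries for a 36-layer model: index 0 is the embedding output and index `i` is the output of block `i`. The sketch below assumes "layer 9" means the output of the ninth block (index 9); that indexing convention is an assumption, and the names are hypothetical.

```python
N_LAYERS = 36
ANCHOR_LAYERS = (9, 18, 27)


def depth_fraction(layer: int, n_layers: int = N_LAYERS) -> float:
    """Relative depth of a layer, e.g. 9 / 36 = 0.25."""
    return layer / n_layers


def pick_anchor_states(hidden_states):
    """Select the anchor-layer entries from a (n_layers + 1)-long sequence."""
    if len(hidden_states) != N_LAYERS + 1:
        raise ValueError("expected embedding output plus one entry per layer")
    return {layer: hidden_states[layer] for layer in ANCHOR_LAYERS}
```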
### Extraction Pipeline
- Uses HuggingFace native `output_attentions=True` + `output_hidden_states=True`
- Requires `attn_implementation='eager'` (the fused PyTorch SDPA kernel does not return attention weights, so they cannot be extracted from it)
- All inference under `torch.no_grad()` for memory efficiency
- Optional 4-bit NF4 quantization via `bitsandbytes`
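
Taken together, these points correspond to a fairly standard Transformers call sequence. The following is a sketch under stated assumptions rather than the contents of `extraction.py`: the function name `extract_activations` and its return shape are hypothetical, `device_map="auto"` assumes `accelerate` is installed, and the bf16 compute dtype for the quantized path is an illustrative choice.

```python
def extract_activations(prompt: str, model_name: str = "Qwen/Qwen3-4B",
                        quantize: bool = True):
    """Return (tokens, hidden_states, attentions) for a single prompt."""
    # Heavy imports stay local so that importing this module remains cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Eager attention is needed so that attention weights are returned.
    load_kwargs = {"attn_implementation": "eager", "device_map": "auto"}
    if quantize:
        load_kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    else:
        load_kwargs["torch_dtype"] = torch.bfloat16

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, **load_kwargs)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():  # no gradients needed for inspection
        out = model(**inputs, output_attentions=True, output_hidden_states=True)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    # hidden_states: tuple of (1, seq, hidden), one per layer plus embeddings.
    # attentions: tuple of (1, heads, seq, seq), one per layer.
    return tokens, out.hidden_states, out.attentions
```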
### Visualization Stack
- **Plotly** for all interactive charts (hover, zoom, pan) via Gradio's `gr.Plot()`
- Custom dark colorscales: purple-to-gold for activation intensity, dark-to-gold for attention weights
- Built-in PCA implementation (zero external dependencies) with optional UMAP fallback
- Responsive 2×2 grid layout with per-visualization interactive controls
## 📁 Project Structure
```
activation-visualizer/
├── app.py              # Gradio Blocks application (entry point)
├── extraction.py       # Model loading, inference, and demo data generation
├── viz_attention.py    # Attention heatmap visualization
├── viz_magnitude.py    # Activation magnitude bar chart
├── viz_token_layer.py  # Token × layer activation grid
├── viz_scatter.py      # PCA/UMAP scatter plot
├── requirements.txt    # Pinned dependencies
└── README.md
```
## 🎯 Part of the Alogotron Ecosystem
NeuroScope is built by [Alogotron](https://huggingface.co/Alogotron) as a diagnostic and research tool for the Activation Avatars project.
**Related projects:**
- 🖼️ [Milady-Avatar-Dataset](https://huggingface.co/datasets/Alogotron/Milady-Avatar-Dataset): Training data for avatar generation
- 🤖 [Milady-Avatar-Adapter](https://huggingface.co/Alogotron/Milady-Avatar-Adapter): Neural adapter checkpoints
- 🎮 [GameTheory-Bench](https://huggingface.co/datasets/Alogotron/GameTheory-Bench): Game theory evaluation benchmark
## 📜 License
MIT