metadata

title: NeuroScope
emoji: 🧠
colorFrom: indigo
colorTo: yellow
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
license: mit
tags:
  - activation-visualization
  - interpretability
  - attention
  - hidden-states
  - qwen
  - transformer
  - mechanistic-interpretability
  - neural-network

🧠 NeuroScope — Neural Network Activation Visualizer

See inside a large language model while it thinks.

NeuroScope is an interactive dashboard for visualizing the internal representations of Qwen3-4B during inference. Watch attention patterns form, track how token representations evolve through 36 transformer layers, and discover how the network processes language — all in real time.

Built for mechanistic interpretability researchers, ML students, and anyone curious about what happens inside a transformer.

✨ Visualizations

🔍 Attention Heatmap

A token×token matrix showing where the model "looks" at each layer and head.

What to look for:

Diagonal patterns → tokens attending to themselves (self-reinforcement)
First-column stripes → BOS/sink attention — many heads route unused attention to the first token as a "no-op" dump
Sub-diagonal bands → previous-token heads, where each token attends to its immediate predecessor; together with induction heads they support copying and in-context learning
Block patterns → semantic phrase grouping (articles attending to their nouns, verbs to their subjects)
Switch between all 32 individual heads or view average/max aggregation to spot global patterns
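As a rough illustration of the average/max aggregation and the first-column "sink" pattern described above, the sketch below works on a toy attention tensor. The function names `aggregate_heads` and `first_token_mass` are hypothetical and are not taken from NeuroScope's source; the only assumption about the data is a `(heads, seq, seq)` layout.

```python
import numpy as np

def aggregate_heads(attn: np.ndarray, mode: str = "mean") -> np.ndarray:
    """Collapse a (heads, seq, seq) attention tensor into one (seq, seq) map."""
    if mode == "mean":
        return attn.mean(axis=0)
    if mode == "max":
        return attn.max(axis=0)
    raise ValueError(f"unknown aggregation mode: {mode}")

def first_token_mass(attn: np.ndarray) -> np.ndarray:
    """Per-head average attention weight placed on the first token (column 0)."""
    return attn[:, :, 0].mean(axis=1)

# Toy causal attention with 2 heads and 4 tokens.
seq = 4
causal = np.tril(np.ones((seq, seq)))
uniform_head = causal / causal.sum(axis=1, keepdims=True)  # spreads weight evenly
sink_head = np.zeros((seq, seq))
sink_head[:, 0] = 1.0                                      # every query looks at token 0
attn = np.stack([uniform_head, sink_head])

avg_map = aggregate_heads(attn, "mean")   # rows still sum to 1
max_map = aggregate_heads(attn, "max")    # rows no longer sum to 1
sink_scores = first_token_mass(attn)      # high value = likely sink head
```

A head whose score is close to 1.0 sends nearly all of its attention to the first token, which is the first-column stripe in the heatmap.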

📊 Activation Magnitude

Bar chart tracking the L2 norm of hidden states across all 36 layers.

What to look for:

Gradual magnitude growth → normal residual stream accumulation through depth
Sudden spikes → layers performing heavy computation, often aligned with MLP activity
⭐ Gold-highlighted layers 9, 18, 27 → the three layers used by the Activation Avatars project for mapping neural activity to avatar expressions
Embedding→final ratio → how much the network amplifies representations end-to-end
Try different metrics: mean L2 for overall energy, max L2 for peak activations, mean absolute for raw magnitude
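The three metrics can be written down in a few lines of NumPy. This is a minimal sketch, assuming hidden states are stacked as `(layers, tokens, hidden_dim)`; the function name and the metric keys (`mean_l2`, `max_l2`, `mean_abs`) are illustrative and may not match the names used in the app.

```python
import numpy as np

def layer_magnitudes(hidden: np.ndarray, metric: str = "mean_l2") -> np.ndarray:
    """Reduce (layers, tokens, hidden_dim) hidden states to one value per layer."""
    if metric == "mean_l2":   # average token norm: overall energy
        return np.linalg.norm(hidden, axis=-1).mean(axis=-1)
    if metric == "max_l2":    # largest token norm: peak activation
        return np.linalg.norm(hidden, axis=-1).max(axis=-1)
    if metric == "mean_abs":  # average absolute entry: raw magnitude
        return np.abs(hidden).mean(axis=(-2, -1))
    raise ValueError(f"unknown metric: {metric}")

# Toy input: 2 "layers", 2 tokens, 2 dimensions.
hidden = np.array([
    [[3.0, 4.0], [0.0, 0.0]],   # token norms 5 and 0
    [[6.0, 8.0], [3.0, 4.0]],   # token norms 10 and 5
])

mean_l2 = layer_magnitudes(hidden, "mean_l2")    # [2.5, 7.5]
max_l2 = layer_magnitudes(hidden, "max_l2")      # [5.0, 10.0]
mean_abs = layer_magnitudes(hidden, "mean_abs")  # [1.75, 5.25]
ratio = mean_l2[-1] / mean_l2[0]                 # first-to-last amplification: 3.0
```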

🌡️ Token × Layer Grid

A heatmap with tokens as columns and layers as rows, revealing how each token's activation strength evolves through the full network depth.

What to look for:

Bright vertical columns → tokens that stay strongly activated throughout (content words, pivotal tokens)
Dim columns → function words or tokens the network processes minimally
Horizontal bright bands → layers that uniformly boost all tokens (computation-heavy layers)
Phase transitions → abrupt changes in activation patterns at certain depth thresholds, often near layers 9, 18, or 27
Try different normalization modes: global for absolute comparison, per_token to see each token's depth journey, per_layer to highlight within-layer variation
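The difference between the three normalization modes is easiest to see on a tiny grid. The sketch below assumes min-max scaling over a `(layers, tokens)` array of activation strengths; NeuroScope may scale differently (for example by dividing by the maximum), and `normalize_grid` is a hypothetical name.

```python
import numpy as np

def normalize_grid(grid: np.ndarray, mode: str = "global") -> np.ndarray:
    """Min-max scale a (layers, tokens) grid to [0, 1].

    global    -> one range for the whole grid
    per_token -> each column (token) scaled over depth
    per_layer -> each row (layer) scaled over tokens
    """
    axis = {"global": None, "per_token": 0, "per_layer": 1}[mode]
    lo = grid.min(axis=axis, keepdims=True)
    hi = grid.max(axis=axis, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on flat slices
    return (grid - lo) / span

# Rows are layers, columns are tokens.
grid = np.array([[1.0, 2.0],
                 [3.0, 6.0]])

g = normalize_grid(grid, "global")     # [[0.0, 0.2], [0.4, 1.0]]
t = normalize_grid(grid, "per_token")  # [[0.0, 0.0], [1.0, 1.0]]
l = normalize_grid(grid, "per_layer")  # [[0.0, 1.0], [0.0, 1.0]]
```

In this toy case `per_token` hides the fact that the second token is stronger overall, while `per_layer` hides the growth with depth, which is why `global` is the mode to use for absolute comparison.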

🎯 Token Representation Space (PCA/UMAP)

2D scatter plot of token hidden states projected via PCA or UMAP, showing how tokens organize in representation space.

What to look for:

Semantic clustering → similar-meaning tokens pulled together (adjectives grouping, nouns grouping)
Positional gradients → early vs. late tokens occupying different regions
Layer trajectories → overlay multiple layers (e.g., 0,9,18,27,35) to watch tokens "move" through representation space as they pass through the network. Tokens typically start clustered at the embedding layer and progressively separate
Outlier tokens → special tokens (BOS, punctuation) often sit far from content tokens
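For readers who want to reproduce the PCA projection outside the dashboard, here is a minimal NumPy-only sketch based on the singular value decomposition. It is one common way to implement PCA and is not necessarily how NeuroScope's built-in version works; `pca_2d` is a hypothetical name and the input is random stand-in data.

```python
import numpy as np

def pca_2d(x: np.ndarray) -> np.ndarray:
    """Project the rows of x (tokens, hidden_dim) onto the top two principal components."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
token_states = rng.normal(size=(12, 64))  # stand-in for one layer's hidden states
coords = pca_2d(token_states)             # shape (12, 2), one point per token
```

To compare several layers in one plot, a reasonable choice is to stack their hidden states row-wise and fit the projection once, so all layers share the same two axes.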

🚀 Quick Start

Demo Mode (no GPU required)

pip install numpy plotly gradio
python app.py
# Opens at http://localhost:7860

Demo mode generates realistic synthetic data matching Qwen3-4B's full dimensions (36 layers, 32 heads, 2560 hidden dim) with structured patterns including head specialization and depth-dependent magnitude growth — perfect for exploring the UI without a GPU.
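To give a feel for what synthetic data of this shape might look like, the sketch below builds causal attention maps and hidden states whose scale grows with depth. It is an illustration only: the function name, the random-logit attention, and the 1x to 4x growth schedule are assumptions, and it does not model head specialization.

```python
import numpy as np

def make_demo_data(n_layers=36, n_heads=32, seq_len=8, hidden_dim=2560, seed=0):
    """Return (attn, hidden) with Qwen3-4B-like shapes, filled with synthetic values."""
    rng = np.random.default_rng(seed)

    # Causal attention: random logits, future positions masked, softmax over keys.
    logits = rng.normal(size=(n_layers, n_heads, seq_len, seq_len))
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    logits = np.where(future, -np.inf, logits)
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    attn = weights / weights.sum(axis=-1, keepdims=True)

    # Hidden states: embedding output plus one state per layer, scaled up with depth.
    scale = np.linspace(1.0, 4.0, n_layers + 1)[:, None, None]
    hidden = rng.normal(size=(n_layers + 1, seq_len, hidden_dim)) * scale
    return attn, hidden

attn, hidden = make_demo_data(seq_len=6)
# attn:   (36, 32, 6, 6), each row sums to 1, no weight on future tokens
# hidden: (37, 6, 2560), norms grow from the embedding to the last layer
```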

Real Model Inference

pip install -r requirements.txt
python app.py --model
# Loads Qwen3-4B with 4-bit quantization (~3 GB VRAM)

For unquantized bf16 weights on high-VRAM setups (recommended for research):

python app.py --model --no-quantize
# Loads in bf16 (~8 GB VRAM)

CLI Options

| Flag | Description | Default |
|------|-------------|---------|
| `--model` | Load Qwen3-4B for real inference (requires GPU) | Off (demo mode) |
| `--model-name` | HuggingFace model name or path | `Qwen/Qwen3-4B` |
| `--no-quantize` | Use bf16 instead of 4-bit quantization | Quantized |
| `--port` | Server port | `7860` |
| `--share` | Create a public Gradio share link | Off |
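The flags above map naturally onto a standard `argparse` parser. The sketch below mirrors the table; it is a guess at how such a parser could be written, not a copy of the one in `app.py`, and `build_parser` is a hypothetical name.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented CLI flags and defaults."""
    parser = argparse.ArgumentParser(description="NeuroScope activation visualizer")
    parser.add_argument("--model", action="store_true",
                        help="load the real model instead of running in demo mode")
    parser.add_argument("--model-name", default="Qwen/Qwen3-4B",
                        help="HuggingFace model name or local path")
    parser.add_argument("--no-quantize", action="store_true",
                        help="load in bf16 instead of 4-bit quantization")
    parser.add_argument("--port", type=int, default=7860, help="server port")
    parser.add_argument("--share", action="store_true",
                        help="create a public Gradio share link")
    return parser

# Equivalent to: python app.py --model --no-quantize
args = build_parser().parse_args(["--model", "--no-quantize"])
use_quantization = args.model and not args.no_quantize  # False for this invocation
```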

🏗️ Architecture

Model: Qwen3-4B

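| Parameter | Value |
|-----------|-------|
| Hidden layers | 36 |
| Attention heads | 32 (GQA with 8 KV heads) |
| Hidden dimension | 2560 |
| Head dimension | 128 (set explicitly in the model config, not derived as 2560 / 32) |
| Positional encoding | RoPE |
| MLP type | SwiGLU |
| Total parameters | ~4B |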
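Grouped-query attention means the 32 query heads shown in the heatmap do not each have their own keys and values. The short sketch below works out the grouping, assuming consecutive query heads share a KV head, which is the layout commonly used by GQA implementations; the helper name is hypothetical.

```python
N_QUERY_HEADS = 32
N_KV_HEADS = 8
GROUP_SIZE = N_QUERY_HEADS // N_KV_HEADS  # 4 query heads per KV head

def kv_head_for(query_head: int) -> int:
    """Index of the KV head that a given query head reads from (assumed layout)."""
    if not 0 <= query_head < N_QUERY_HEADS:
        raise IndexError(f"query head out of range: {query_head}")
    return query_head // GROUP_SIZE

# Map each KV head to the query heads that share it.
sharing = {
    kv: [q for q in range(N_QUERY_HEADS) if kv_head_for(q) == kv]
    for kv in range(N_KV_HEADS)
}
# sharing[0] == [0, 1, 2, 3], sharing[7] == [28, 29, 30, 31]
```

Heads in the same group can still show different attention patterns, because each query head keeps its own query projection.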

Layers 9, 18, 27 — Activation Avatars Anchor Points

NeuroScope highlights three layers that serve as anchor points for the Activation Avatars project, which maps LLM hidden states to real-time avatar expressions:

Layer 9 (~25% depth) — Captures syntactic structure and grammatical relationships
Layer 18 (~50% depth) — Encodes semantic meaning and contextual understanding
Layer 27 (~75% depth) — Represents higher-level pragmatic and emotional content

These layers were selected through probing experiments showing they capture complementary levels of linguistic abstraction, making them ideal anchor points for mapping neural activity to expressive visual representations.

Extraction Pipeline

Uses Hugging Face's native output_attentions=True and output_hidden_states=True forward-pass flags
Requires attn_implementation='eager', because the fused SDPA attention path does not return attention weights
All inference runs under torch.no_grad(), so no autograd graph is kept in memory
Optional 4-bit NF4 quantization via bitsandbytes
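Put together, the steps above could look roughly like the following. This is a sketch under stated assumptions rather than the contents of `extraction.py`: the function name and return layout are hypothetical, `device_map="auto"` assumes the `accelerate` package is installed, and the heavy imports sit inside the function so the module can be imported without a GPU stack.

```python
def extract_activations(prompt: str, model_name: str = "Qwen/Qwen3-4B", quantize: bool = True):
    """Run one forward pass and return (tokens, attentions, hidden_states)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Eager attention is needed so the forward pass can return attention weights.
    load_kwargs = {"attn_implementation": "eager", "device_map": "auto"}
    if quantize:
        load_kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    else:
        load_kwargs["torch_dtype"] = torch.bfloat16

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, **load_kwargs)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():  # inference only, no autograd graph
        out = model(**inputs, output_attentions=True, output_hidden_states=True)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    # One (heads, seq, seq) tensor per layer, stacked to (layers, heads, seq, seq).
    attentions = torch.stack([a[0].float().cpu() for a in out.attentions])
    # Embedding output plus one (seq, hidden) tensor per layer: (layers + 1, seq, hidden).
    hidden_states = torch.stack([h[0].float().cpu() for h in out.hidden_states])
    return tokens, attentions, hidden_states
```

Calling `extract_activations("The cat sat on the mat")` downloads the model on first use, so it is best tried on a machine with a suitable GPU.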

Visualization Stack

Plotly for all interactive charts (hover, zoom, pan) via Gradio's gr.Plot()
Custom dark colorscales: purple-to-gold for activation intensity, dark-to-gold for attention weights
Built-in PCA implementation that needs no extra dependencies, with UMAP available as an optional projection
Responsive 2×2 grid layout with per-visualization interactive controls

📁 Project Structure

activation-visualizer/
├── app.py              # Gradio Blocks application (entry point)
├── extraction.py       # Model loading, inference, and demo data generation
├── viz_attention.py    # Attention heatmap visualization
├── viz_magnitude.py    # Activation magnitude bar chart
├── viz_token_layer.py  # Token × layer activation grid
├── viz_scatter.py      # PCA/UMAP scatter plot
├── requirements.txt    # Pinned dependencies
└── README.md

🎯 Part of the Alogotron Ecosystem

NeuroScope is built by Alogotron as a diagnostic and research tool for the Activation Avatars project.

Related projects:

🖼️ Milady-Avatar-Dataset — Training data for avatar generation
🤖 Milady-Avatar-Adapter — Neural adapter checkpoints
🎮 GameTheory-Bench — Game theory evaluation benchmark

📜 License

MIT