
	---
	title: NeuroScope
	emoji: 🧠
	colorFrom: indigo
	colorTo: yellow
	sdk: gradio
	sdk_version: 6.12.0
	app_file: app.py
	pinned: false
	license: mit
	tags:
	- activation-visualization
	- interpretability
	- attention
	- hidden-states
	- qwen
	- transformer
	- mechanistic-interpretability
	- neural-network
	---

	# 🧠 NeuroScope — Neural Network Activation Visualizer

	See inside a large language model while it thinks.

	NeuroScope is an interactive dashboard for visualizing the internal representations of [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) during inference. Watch attention patterns form, track how token representations evolve through 36 transformer layers, and discover how the network processes language — all in real time.

	Built for mechanistic interpretability researchers, ML students, and anyone curious about what happens inside a transformer.

	## ✨ Visualizations

	### 🔍 Attention Heatmap

	A token×token matrix showing where the model "looks" at each layer and head.

	What to look for:
	- Diagonal patterns → tokens attending to themselves (self-reinforcement)
	- First-column stripes → BOS/sink attention — many heads route unused attention to the first token as a "no-op" dump
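	- Sub-diagonal bands → previous-token heads attending one position back; induction heads, which implement copying and in-context learning, show a similar off-diagonal stripe on prompts with repeated sequences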
	- Block patterns → semantic phrase grouping (articles attending to their nouns, verbs to their subjects)
	- Switch between all 32 individual heads or view average/max aggregation to spot global patterns
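
The average/max aggregation and the first-column "sink" pattern described above can be sketched with NumPy. This is a minimal illustration on synthetic causal attention, not the code in `viz_attention.py`; the function names `aggregate_heads` and `sink_fraction` are hypothetical.

```python
import numpy as np

def aggregate_heads(attn: np.ndarray, mode: str = "mean") -> np.ndarray:
    """Collapse a (heads, seq, seq) attention tensor into one (seq, seq) map."""
    if mode == "mean":
        return attn.mean(axis=0)
    if mode == "max":
        return attn.max(axis=0)
    raise ValueError(f"unknown aggregation mode: {mode}")

def sink_fraction(attn_map: np.ndarray) -> float:
    """Average attention mass that query positions place on the first token."""
    return float(attn_map[:, 0].mean())

# Synthetic causal attention for a single layer: 32 heads, 6 tokens.
rng = np.random.default_rng(0)
n_heads, seq_len = 32, 6
scores = rng.normal(size=(n_heads, seq_len, seq_len))
causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
scores = np.where(causal, scores, -np.inf)            # hide future positions
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)              # row-wise softmax

mean_map = aggregate_heads(attn, "mean")  # rows still sum to 1
max_map = aggregate_heads(attn, "max")    # highlights what any single head attends to
print(f"sink fraction of averaged map: {sink_fraction(mean_map):.3f}")
```

The mean keeps each row a valid probability distribution, while the max does not; that is why the max view is better read as "did any head look here" than as a distribution.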

	### 📊 Activation Magnitude

	Bar chart tracking the L2 norm of hidden states across all 36 layers.

	What to look for:
	- Gradual magnitude growth → normal residual stream accumulation through depth
	- Sudden spikes → layers performing heavy computation, often aligned with MLP activity
	- ⭐ Gold-highlighted layers 9, 18, 27 → the three layers used by the [Activation Avatars](https://huggingface.co/Alogotron/Milady-Avatar-Adapter) project for mapping neural activity to avatar expressions
	- Embedding→final ratio → how much the network amplifies representations end-to-end
	- Try different metrics: mean L2 for overall energy, max L2 for peak activations, mean absolute for raw magnitude
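
As a rough sketch of the three metrics named above, assuming hidden states are stacked into an array of shape `(layers + 1, tokens, hidden_dim)` with the embedding output first. The function name `layer_magnitudes` and the metric keys are hypothetical, and the data is synthetic.

```python
import numpy as np

def layer_magnitudes(hidden: np.ndarray, metric: str = "mean_l2") -> np.ndarray:
    """Reduce (layers, tokens, hidden_dim) hidden states to one value per layer."""
    token_norms = np.linalg.norm(hidden, axis=-1)     # (layers, tokens)
    if metric == "mean_l2":
        return token_norms.mean(axis=-1)              # overall energy
    if metric == "max_l2":
        return token_norms.max(axis=-1)               # peak token activation
    if metric == "mean_abs":
        return np.abs(hidden).mean(axis=(1, 2))       # raw per-element magnitude
    raise ValueError(f"unknown metric: {metric}")

# Synthetic hidden states whose scale grows with depth (embedding + 36 layers).
rng = np.random.default_rng(0)
depth_scale = np.linspace(1.0, 5.0, 37)[:, None, None]
hidden = depth_scale * rng.normal(size=(37, 6, 128))

mags = layer_magnitudes(hidden, "mean_l2")
amplification = mags[-1] / mags[0]                    # embedding -> final ratio
print(f"embedding-to-final ratio: {amplification:.2f}")
```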

	### 🌡️ Token × Layer Grid

	A heatmap with tokens as columns and layers as rows, revealing how each token's activation strength evolves through the full network depth.

	What to look for:
	- Bright vertical columns → tokens that stay strongly activated throughout (content words, pivotal tokens)
	- Dim columns → function words or tokens the network processes minimally
	- Horizontal bright bands → layers that uniformly boost all tokens (computation-heavy layers)
	- Phase transitions → abrupt changes in activation patterns at certain depth thresholds, often near layers 9, 18, or 27
	- Try different normalization modes: global for absolute comparison, per_token to see each token's depth journey, per_layer to highlight within-layer variation
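
The three normalization modes differ only in the axis over which the scaling statistics are taken. The sketch below assumes min-max scaling, which this README does not confirm; `normalize_grid` is a hypothetical name, and the grid has layers as rows and tokens as columns.

```python
import numpy as np

def normalize_grid(grid: np.ndarray, mode: str = "global", eps: float = 1e-12) -> np.ndarray:
    """Min-max scale a (layers, tokens) grid to roughly [0, 1]."""
    if mode == "global":            # one scale for the whole grid
        lo, hi = grid.min(), grid.max()
    elif mode == "per_token":       # each column (token) scaled over depth
        lo = grid.min(axis=0, keepdims=True)
        hi = grid.max(axis=0, keepdims=True)
    elif mode == "per_layer":       # each row (layer) scaled over tokens
        lo = grid.min(axis=1, keepdims=True)
        hi = grid.max(axis=1, keepdims=True)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return (grid - lo) / (hi - lo + eps)  # eps guards against constant rows/columns

grid = np.array([[1.0, 2.0],
                 [3.0, 6.0]])       # 2 layers x 2 tokens
print(normalize_grid(grid, "global"))
print(normalize_grid(grid, "per_token"))
print(normalize_grid(grid, "per_layer"))
```

With `global`, only the single largest cell reaches 1; with `per_token` or `per_layer`, every column or row has its own brightest cell, which trades absolute comparability for local contrast.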

	### 🎯 Token Representation Space (PCA/UMAP)

	2D scatter plot of token hidden states projected via PCA or UMAP, showing how tokens organize in representation space.

	What to look for:
	- Semantic clustering → similar-meaning tokens pulled together (adjectives grouping, nouns grouping)
	- Positional gradients → early vs. late tokens occupying different regions
	- Layer trajectories → overlay multiple layers (e.g., `0,9,18,27,35`) to watch tokens "move" through representation space as they pass through the network. Tokens typically start clustered at the embedding layer and progressively separate
	- Outlier tokens → special tokens (BOS, punctuation) often sit far from content tokens
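
A layer-trajectory overlay needs all selected layers projected onto the same axes. One way to do that, sketched here with an SVD-based PCA in NumPy, is to fit a single projection on the stacked hidden states. The function name `pca_2d` and the random data are illustrative assumptions, and this is not necessarily how `viz_scatter.py` does it.

```python
import numpy as np

def pca_2d(points: np.ndarray):
    """Project (n_points, dim) data onto its top two principal components."""
    centered = points - points.mean(axis=0, keepdims=True)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ components[:2].T
    explained = singular_values[:2] ** 2 / np.sum(singular_values ** 2)
    return coords, explained

# Synthetic stand-in for hidden states: (layers, tokens, hidden_dim).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(36, 7, 64))
layers = [0, 9, 18, 27, 35]

# Fit one PCA over every token at every selected layer so the axes are shared.
stacked = hidden[layers].reshape(-1, hidden.shape[-1])
coords, explained = pca_2d(stacked)

# trajectories[k, t] is the 2D position of token t at layers[k].
trajectories = coords.reshape(len(layers), hidden.shape[1], 2)
print(trajectories.shape, explained)
```

Fitting a separate PCA per layer would make each layer's axes unrelated, so apparent "movement" between layers would be meaningless; the joint fit avoids that.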

	## 🚀 Quick Start

	### Demo Mode (no GPU required)

	```bash
	pip install numpy plotly gradio
	python app.py
	# Opens at http://localhost:7860
	```

	Demo mode generates realistic synthetic data matching Qwen3-4B's full dimensions (36 layers, 32 heads, 2560 hidden dim) with structured patterns including head specialization and depth-dependent magnitude growth — perfect for exploring the UI without a GPU.
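
To give a feel for what "structured" synthetic data can mean, here is a small sketch that assigns each head a simple role and scales hidden states with depth. It is an assumption-laden illustration, not the generator in `extraction.py`; the function names, the three roles, and the numeric constants are all made up for the example.

```python
import numpy as np

def synthetic_attention(n_heads: int = 32, seq_len: int = 8, seed: int = 0) -> np.ndarray:
    """Return (n_heads, seq_len, seq_len) causal attention with simple head roles."""
    rng = np.random.default_rng(seed)
    pos = np.arange(seq_len)
    causal = pos[None, :] <= pos[:, None]             # query i sees keys j <= i
    scores = rng.normal(scale=0.5, size=(n_heads, seq_len, seq_len))
    for h in range(n_heads):
        role = h % 3
        if role == 0:                                 # sink head: first column
            scores[h, :, 0] += 4.0
        elif role == 1:                               # previous-token head: sub-diagonal
            scores[h, pos[1:], pos[:-1]] += 4.0
        else:                                         # self-attending head: diagonal
            scores[h, pos, pos] += 4.0
    scores = np.where(causal, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

def synthetic_hidden(n_layers: int = 36, seq_len: int = 8,
                     hidden_dim: int = 2560, seed: int = 0) -> np.ndarray:
    """Return (n_layers + 1, seq_len, hidden_dim) states whose norm grows with depth."""
    rng = np.random.default_rng(seed)
    scale = np.linspace(1.0, 6.0, n_layers + 1)[:, None, None]
    return scale * rng.normal(size=(n_layers + 1, seq_len, hidden_dim))

attn = synthetic_attention()
hidden = synthetic_hidden(hidden_dim=64)              # small dim keeps the demo fast
print(attn.shape, hidden.shape)
```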

	### Real Model Inference

	```bash
	pip install -r requirements.txt
	python app.py --model
	# Loads Qwen3-4B with 4-bit quantization (~3 GB VRAM)
	```

	For unquantized bf16 weights on high-VRAM setups (recommended for research):

	```bash
	python app.py --model --no-quantize
	# Loads in bf16 (~8 GB VRAM)
	```

	### CLI Options

	| Flag | Description | Default |
	|------|-------------|---------|
	| `--model` | Load Qwen3-4B for real inference (requires GPU) | Off (demo mode) |
	| `--model-name` | HuggingFace model name or path | `Qwen/Qwen3-4B` |
	| `--no-quantize` | Use bf16 instead of 4-bit quantization | Quantized |
	| `--port` | Server port | `7860` |
	| `--share` | Create a public Gradio share link | Off |
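
For readers who want to script around these flags, the table maps naturally onto an `argparse` parser. The sketch below mirrors the flags and defaults listed above; the parser in `app.py` may be organized differently, and `build_parser` is a hypothetical name.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented CLI flags and their defaults."""
    parser = argparse.ArgumentParser(description="NeuroScope activation visualizer")
    parser.add_argument("--model", action="store_true",
                        help="load the model for real inference (default: demo mode)")
    parser.add_argument("--model-name", default="Qwen/Qwen3-4B",
                        help="HuggingFace model name or local path")
    parser.add_argument("--no-quantize", action="store_true",
                        help="use bf16 instead of 4-bit quantization")
    parser.add_argument("--port", type=int, default=7860, help="server port")
    parser.add_argument("--share", action="store_true",
                        help="create a public Gradio share link")
    return parser

# Equivalent to: python app.py --model --port 8080
args = build_parser().parse_args(["--model", "--port", "8080"])
print(args.model, args.model_name, args.no_quantize, args.port, args.share)
```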

	## 🏗️ Architecture

	### Model: Qwen3-4B

	| Parameter | Value |
	|-----------|-------|
	| Hidden layers | 36 |
	| Attention heads | 32 (GQA with 8 KV heads) |
	| Hidden dimension | 2560 |
	| Head dimension | 128 (set explicitly in the model config, not derived as 2560 / 32 = 80) |
	| Positional encoding | RoPE |
	| MLP type | SwiGLU |
	| Total parameters | ~4B |
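
Two pieces of arithmetic follow from this table: how the 32 query heads share 8 KV heads under GQA, and roughly how much memory the weights alone need. The sketch assumes the contiguous grouping used by common GQA implementations (query head `h` reads KV head `h // group_size`) and counts weights only, so real VRAM use will be higher. The function names are hypothetical.

```python
N_QUERY_HEADS = 32
N_KV_HEADS = 8
group_size = N_QUERY_HEADS // N_KV_HEADS   # query heads per shared KV head

def kv_head_for(query_head: int) -> int:
    """KV head index read by a query head, assuming contiguous grouping."""
    return query_head // group_size

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring activations and overhead."""
    return n_params * bits_per_param / 8 / 1e9

print(group_size)                        # 4
print([kv_head_for(h) for h in (0, 3, 4, 31)])
print(weight_memory_gb(4e9, 16))         # bf16 weights
print(weight_memory_gb(4e9, 4))          # 4-bit weights
```

Because heads in the same group share keys and values but have separate queries, their attention maps can still differ even though they read the same KV head.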

	### Layers 9, 18, 27 — Activation Avatars Anchor Points

	NeuroScope highlights three layers that serve as anchor points for the [Activation Avatars](https://huggingface.co/Alogotron/Milady-Avatar-Adapter) project, which maps LLM hidden states to real-time avatar expressions:

	- Layer 9 (~25% depth) — Captures syntactic structure and grammatical relationships
	- Layer 18 (~50% depth) — Encodes semantic meaning and contextual understanding
	- Layer 27 (~75% depth) — Represents higher-level pragmatic and emotional content

	These layers were selected through probing experiments showing they capture complementary levels of linguistic abstraction, making them ideal anchor points for mapping neural activity to expressive visual representations.

	### Extraction Pipeline

	- Uses HuggingFace native `output_attentions=True` + `output_hidden_states=True`
	- Requires `attn_implementation='eager'`, because PyTorch's fused SDPA kernel does not return the attention weights
	- All inference under `torch.no_grad()` for memory efficiency
	- Optional 4-bit NF4 quantization via `bitsandbytes`
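
Put together, the steps above look roughly like the following with the `transformers` API. This is a sketch under stated assumptions rather than the contents of `extraction.py`: the function name `extract_activations` is hypothetical, `device_map="auto"` assumes `accelerate` is installed, and the 4-bit path assumes `bitsandbytes` and a CUDA GPU.

```python
def extract_activations(prompt: str, model_name: str = "Qwen/Qwen3-4B",
                        quantize: bool = True):
    """Return (tokens, attentions, hidden_states) as NumPy arrays for one prompt."""
    # Imported lazily so the module can be loaded without a GPU stack installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Eager attention is needed so the forward pass can return attention weights.
    load_kwargs = {"attn_implementation": "eager", "device_map": "auto"}
    if quantize:
        load_kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    else:
        load_kwargs["torch_dtype"] = torch.bfloat16

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, **load_kwargs)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**inputs, output_attentions=True, output_hidden_states=True)

    # out.attentions: one (batch, heads, seq, seq) tensor per layer.
    # out.hidden_states: embedding output plus one (batch, seq, hidden) tensor per layer.
    attentions = torch.stack(out.attentions).squeeze(1).float().cpu().numpy()
    hidden = torch.stack(out.hidden_states).squeeze(1).float().cpu().numpy()
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return tokens, attentions, hidden
```

Calling `extract_activations("The cat sat on the mat")` downloads the model on first use. Note that `hidden` has one more entry along its first axis than `attentions`, because the embedding output is included.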

	### Visualization Stack

	- Plotly for all interactive charts (hover, zoom, pan) via Gradio's `gr.Plot()`
	- Custom dark colorscales: purple-to-gold for activation intensity, dark-to-gold for attention weights
	- Built-in PCA implementation (no additional dependencies), with UMAP available as an optional alternative
	- Responsive 2×2 grid layout with per-visualization interactive controls

	## 📁 Project Structure

	```
	activation-visualizer/
	├── app.py              # Gradio Blocks application (entry point)
	├── extraction.py       # Model loading, inference, and demo data generation
	├── viz_attention.py    # Attention heatmap visualization
	├── viz_magnitude.py    # Activation magnitude bar chart
	├── viz_token_layer.py  # Token × layer activation grid
	├── viz_scatter.py      # PCA/UMAP scatter plot
	├── requirements.txt    # Pinned dependencies
	└── README.md
	```

	## 🎯 Part of the Alogotron Ecosystem

	NeuroScope is built by [Alogotron](https://huggingface.co/Alogotron) as a diagnostic and research tool for the Activation Avatars project.

	Related projects:
	- 🖼️ [Milady-Avatar-Dataset](https://huggingface.co/datasets/Alogotron/Milady-Avatar-Dataset) — Training data for avatar generation
	- 🤖 [Milady-Avatar-Adapter](https://huggingface.co/Alogotron/Milady-Avatar-Adapter) — Neural adapter checkpoints
	- 🎮 [GameTheory-Bench](https://huggingface.co/datasets/Alogotron/GameTheory-Bench) — Game theory evaluation benchmark

	## 📜 License

	MIT