# MLM Probability Fix - Complete Documentation

## Issue Identified

The user correctly observed that **changing the MLM probability did not affect the results at all** in the encoder model visualization. This was a significant bug in how the MLM probability parameter was used.
## Root Cause Analysis

### What Was Wrong

The MLM probability setting had two separate effects that were not properly connected:

1. **Average Perplexity Calculation** ✅ (working correctly)
   - Used random masking with the specified MLM probability
   - Affected the summary statistic shown to the user
2. **Per-Token Visualization** ❌ (the bug was here)
   - Always masked each token individually
   - Completely ignored the MLM probability setting
   - As a result, changing the MLM probability had no visual effect
### The Disconnect

```python
# OLD CODE - the MLM probability was ignored for visualization
for i in range(len(tokens)):
    if not special_token:
        # ALWAYS calculated detailed perplexity for every token
        masked_input[0, i] = tokenizer.mask_token_id
        # ... calculate perplexity
```
## The Fix

### 1. Made MLM Probability Affect Visualization

Now the MLM probability controls which tokens get detailed analysis:

```python
# NEW CODE - the MLM probability drives the visualization
for i in range(len(tokens)):
    if not special_token:
        if torch.rand(1).item() < mlm_probability:  # <-- now respects the MLM probability
            # Calculate detailed perplexity for this token
            masked_input[0, i] = tokenizer.mask_token_id
            # ... calculate detailed perplexity
        else:
            # Use baseline perplexity for non-analyzed tokens
            token_perplexities.append(2.0)  # neutral baseline
```
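The loop above can be sketched end to end without loading a model; here `score_token` is a hypothetical stand-in for the real masked-prediction perplexity computation, and special tokens are detected with a simplified bracket check:

```python
import random

def visualize_tokens(tokens, mlm_probability, score_token, baseline=2.0, rng=None):
    """Sketch of the fixed loop: each non-special token gets a detailed
    score with probability `mlm_probability`; the rest keep the baseline."""
    rng = rng or random.Random()
    results = []
    for tok in tokens:
        if tok.startswith("["):             # simplified [CLS]/[SEP]-style check
            results.append((tok, None))     # special tokens are never analyzed
        elif rng.random() < mlm_probability:
            results.append((tok, score_token(tok)))  # detailed analysis
        else:
            results.append((tok, baseline))           # neutral baseline
    return results

# Stub scorer standing in for the real masked-LM perplexity computation
out = visualize_tokens(
    ["[CLS]", "the", "capital", "of", "france", "[SEP]"],
    mlm_probability=0.5,
    score_token=lambda tok: 5.0,
    rng=random.Random(0),
)
```

With a fixed seed the same subset of tokens is selected on every run, which makes the behavior easy to unit-test.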
### 2. Visual Distinction

- **Analyzed tokens**: Colored by actual perplexity (green/yellow/red)
- **Non-analyzed tokens**: Gray, with baseline perplexity
- **Tooltip**: Shows whether the token was analyzed

### 3. Clear User Feedback

- Summary now shows: `MLM Probability: 0.15 (3/8 tokens analyzed in detail)`
- Legend updated: `🟢 Low → 🟡 Medium → 🔴 High → ⚫ Not analyzed`
- Improved help text: "Probability of detailed analysis per token"
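The summary line can be produced by a small helper; the function name is hypothetical, but the formatting matches the example string above:

```python
def summarize_analysis(mlm_probability, analyzed, total):
    """Build the summary line shown to the user (hypothetical helper;
    the real app may format this differently)."""
    return (f"MLM Probability: {mlm_probability:.2f} "
            f"({analyzed}/{total} tokens analyzed in detail)")

summarize_analysis(0.15, 3, 8)
# "MLM Probability: 0.15 (3/8 tokens analyzed in detail)"
```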
## How It Works Now

### Low MLM Probability (0.15)

```
Input: "The capital of France is Paris"
Result: Only ~15% of tokens get detailed analysis
Visualization: Mostly gray tokens with a few colored ones
Effect: Fast analysis, matches BERT training conditions
```

### High MLM Probability (0.5)

```
Input: "The capital of France is Paris"
Result: ~50% of tokens get detailed analysis
Visualization: More colored tokens, fewer gray ones
Effect: More comprehensive but slower analysis
```
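The two scenarios above differ only in the per-token selection rate. A quick simulation (plain Python's `random` standing in for the `torch.rand()` draw) confirms that the average fraction of analyzed tokens tracks the MLM probability:

```python
import random

def expected_analyzed_fraction(mlm_probability, n_tokens, trials, seed=0):
    """Estimate the average fraction of tokens selected for detailed
    analysis when each token is chosen independently with the given
    probability (mirrors the per-token torch.rand() check)."""
    rng = random.Random(seed)
    selected = sum(
        sum(rng.random() < mlm_probability for _ in range(n_tokens))
        for _ in range(trials)
    )
    return selected / (n_tokens * trials)

low = expected_analyzed_fraction(0.15, 8, 10_000)   # close to 0.15
high = expected_analyzed_fraction(0.5, 8, 10_000)   # close to 0.5
```

Note that for a single short sentence the realized fraction can deviate noticeably from the probability; only the average over many draws converges.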
## User Experience Improvements

### Before the Fix

- User changes the MLM probability from 0.15 → 0.5
- No visual change in token colors
- Only the summary statistic changed (confusing!)

### After the Fix

- User changes the MLM probability from 0.15 → 0.5
- More tokens become colored (analyzed)
- Fewer tokens remain gray (non-analyzed)
- Summary shows the token count: "(3/8 tokens analyzed)"
- Clear visual feedback on the parameter's effect
## Testing the Fix

### 1. Quick Test

Try the same text with different MLM probabilities:

- Text: "Machine learning algorithms require computational resources"
- MLM 0.2: Few colored tokens
- MLM 0.8: Most tokens colored

### 2. Demo Script

```bash
python mlm_demo.py
```

Shows exactly how the MLM probability affects the analysis.

### 3. Visual Examples

The app now includes example pairs:

- The same text with MLM 0.2 vs. 0.8
- Shows a clear visual difference
## Technical Details

### Randomness Handling

- Uses `torch.rand()` for consistency with PyTorch
- Each token gets an independent random chance
- Reproducible with manual seeds for testing
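Fixing the seed before the per-token draws yields the same analysis mask on every run. A minimal analogue in plain Python (the app itself would call `torch.manual_seed()` before `torch.rand()` instead):

```python
import random

def select_tokens(n_tokens, mlm_probability, seed):
    """Return one boolean per token: True if it gets detailed analysis.
    Plain-Python analogue of the app's seeded torch.rand() draws."""
    rng = random.Random(seed)  # fixed seed -> identical selection each run
    return [rng.random() < mlm_probability for _ in range(n_tokens)]

run1 = select_tokens(8, 0.15, seed=42)
run2 = select_tokens(8, 0.15, seed=42)
# run1 == run2: the same seed reproduces the same analysis mask
```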
### Baseline Perplexity

- Non-analyzed tokens get perplexity = 2.0
- This represents "neutral" confidence
- Avoids misleading very low/high values

### Color Mapping

- Analyzed tokens: Full color spectrum based on actual perplexity
- Non-analyzed tokens: Gray (`rgb(200, 200, 200)`)
- Tooltips distinguish: "Perplexity: 5.2" vs. "Not analyzed"
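A minimal sketch of that mapping; the green/yellow/red thresholds here are illustrative assumptions, only the gray value comes from the documentation:

```python
def token_color(perplexity, analyzed, low=5.0, high=20.0):
    """Map a token's perplexity to a display color. The `low`/`high`
    cutoffs are illustrative; the app's actual thresholds may differ."""
    if not analyzed:
        return "rgb(200, 200, 200)"   # gray: token skipped by the MLM draw
    if perplexity < low:
        return "green"                # low perplexity: model is confident
    if perplexity < high:
        return "yellow"               # medium perplexity
    return "red"                      # high perplexity: model is surprised

token_color(5.2, analyzed=True)    # "yellow"
token_color(2.0, analyzed=False)   # "rgb(200, 200, 200)"
```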
## Performance Implications

### Lower MLM Probability (0.15)

- **Pros**: Faster; matches BERT training; realistic
- **Cons**: Sparse analysis; some tokens are not evaluated

### Higher MLM Probability (0.8)

- **Pros**: Comprehensive analysis; more visual information
- **Cons**: Slower computation; unrealistic for MLM

### Recommendation

- **Default 0.15**: Standard BERT-like analysis
- **Increase to 0.3-0.5**: For more detailed exploration
- **Avoid >0.8**: Diminishing returns, very slow
## Impact on Model Types

### Decoder Models (GPT, etc.)

- **No change**: The MLM probability only affects encoder models
- All tokens are always analyzed for next-token prediction

### Encoder Models (BERT, etc.)

- **Major improvement**: The MLM probability now has a clear visual effect
- Users can explore different analysis depths
- Better understanding of model confidence patterns
## User Guidance

### When to Use Different MLM Probabilities

**0.15 (Standard)**

- Quick analysis
- Matches BERT training
- Good for initial exploration

**0.3-0.4 (Detailed)**

- More comprehensive view
- Better for understanding difficult texts
- Reasonable computation time

**0.5+ (Comprehensive)**

- Maximum detail
- Research/analysis purposes
- Slower but thorough
## Future Enhancements

### Possible Improvements

1. **Adaptive MLM**: Adjust the probability based on text difficulty
2. **Token importance**: Prioritize content words over function words
3. **Interactive selection**: Let users click tokens to analyze
4. **Batch analysis**: Process multiple MLM probabilities simultaneously

### Configuration Options

The fix is fully configurable via `config.py`:

- Default MLM probability
- Min/max ranges
- Baseline perplexity value
- Color scheme for non-analyzed tokens
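A sketch of what such a `config.py` could look like; the option names and the min/max bounds are assumptions, while the default, baseline, and gray values come from this document:

```python
# config.py - illustrative sketch; option names are assumptions,
# values are taken from the documentation above where stated.
DEFAULT_MLM_PROBABILITY = 0.15              # standard BERT masking rate
MLM_PROBABILITY_MIN = 0.05                  # assumed lower bound of the slider
MLM_PROBABILITY_MAX = 0.95                  # assumed upper bound of the slider
BASELINE_PERPLEXITY = 2.0                   # neutral value for non-analyzed tokens
NON_ANALYZED_COLOR = "rgb(200, 200, 200)"   # gray for skipped tokens
```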
## Conclusion

This fix transforms the MLM probability from a hidden parameter that only affected summary statistics into a **visible, interactive control** that directly shapes the visualization. Users now get immediate visual feedback when adjusting the MLM probability, making the parameter's purpose clear and the analysis more engaging.

The fix maintains backward compatibility while significantly improving the user experience for encoder model analysis.