# MLM Probability Fix - Complete Documentation

## Issue Identified

The user correctly observed that **changing the MLM probability did not affect the results at all** in the encoder model visualization. This was a significant bug in how the MLM probability parameter was used.
## Root Cause Analysis

### What Was Wrong

The MLM probability setting had two separate effects that were not properly connected:

1. **Average Perplexity Calculation** ✅ (working correctly)
   - Used random masking with the specified MLM probability
   - Affected the summary statistic shown to the user
2. **Per-Token Visualization** ❌ (the bug was here)
   - Always masked each token individually
   - Completely ignored the MLM probability setting
   - As a result, changing the MLM probability had no visual effect
### The Disconnect

```python
# OLD CODE - the MLM probability was ignored for visualization
for i in range(len(tokens)):
    if not special_token:
        # ALWAYS calculated detailed perplexity for every token
        masked_input[0, i] = tokenizer.mask_token_id
        # ... calculate perplexity
```
## The Fix

### 1. Made MLM Probability Affect Visualization

Now the MLM probability controls which tokens get detailed analysis:

```python
# NEW CODE - the MLM probability drives the visualization
for i in range(len(tokens)):
    if not special_token:
        if torch.rand(1).item() < mlm_probability:  # <-- now respects the MLM probability
            # Calculate detailed perplexity for this token
            masked_input[0, i] = tokenizer.mask_token_id
            # ... calculate detailed perplexity
        else:
            # Use baseline perplexity for non-analyzed tokens
            token_perplexities.append(2.0)  # neutral baseline
```
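The loop above can be sketched end to end without loading a model; here `score_token` is a hypothetical stand-in for the real masked-prediction perplexity computation, and special tokens are detected with a simplified bracket check:

```python
import random

def visualize_tokens(tokens, mlm_probability, score_token, baseline=2.0, rng=None):
    """Sketch of the fixed loop: each non-special token gets a detailed
    score with probability `mlm_probability`; the rest keep the baseline."""
    rng = rng or random.Random()
    results = []
    for tok in tokens:
        if tok.startswith("["):             # simplified [CLS]/[SEP]-style check
            results.append((tok, None))     # special tokens are never analyzed
        elif rng.random() < mlm_probability:
            results.append((tok, score_token(tok)))  # detailed analysis
        else:
            results.append((tok, baseline))           # neutral baseline
    return results

# Stub scorer standing in for the real masked-LM perplexity computation
out = visualize_tokens(
    ["[CLS]", "the", "capital", "of", "france", "[SEP]"],
    mlm_probability=0.5,
    score_token=lambda tok: 5.0,
    rng=random.Random(0),
)
```

With a fixed seed the same subset of tokens is selected on every run, which makes the behavior easy to unit-test.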
### 2. Visual Distinction

- **Analyzed tokens**: Colored by actual perplexity (green/yellow/red)
- **Non-analyzed tokens**: Gray, with baseline perplexity
- **Tooltip**: Shows whether the token was analyzed

### 3. Clear User Feedback

- Summary now shows: `MLM Probability: 0.15 (3/8 tokens analyzed in detail)`
- Legend updated: `🟢 Low → 🟡 Medium → 🔴 High → ⚫ Not analyzed`
- Improved help text: "Probability of detailed analysis per token"
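The summary line can be produced by a small helper; the function name is hypothetical, but the formatting matches the example string above:

```python
def summarize_analysis(mlm_probability, analyzed, total):
    """Build the summary line shown to the user (hypothetical helper;
    the real app may format this differently)."""
    return (f"MLM Probability: {mlm_probability:.2f} "
            f"({analyzed}/{total} tokens analyzed in detail)")

summarize_analysis(0.15, 3, 8)
# "MLM Probability: 0.15 (3/8 tokens analyzed in detail)"
```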
## How It Works Now

### Low MLM Probability (0.15)

```
Input: "The capital of France is Paris"
Result: Only ~15% of tokens get detailed analysis
Visualization: Mostly gray tokens with a few colored ones
Effect: Fast analysis, matches BERT training conditions
```

### High MLM Probability (0.5)

```
Input: "The capital of France is Paris"
Result: ~50% of tokens get detailed analysis
Visualization: More colored tokens, fewer gray ones
Effect: More comprehensive but slower analysis
```
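The two scenarios above differ only in the per-token selection rate. A quick simulation (plain Python's `random` standing in for the `torch.rand()` draw) confirms that the average fraction of analyzed tokens tracks the MLM probability:

```python
import random

def expected_analyzed_fraction(mlm_probability, n_tokens, trials, seed=0):
    """Estimate the average fraction of tokens selected for detailed
    analysis when each token is chosen independently with the given
    probability (mirrors the per-token torch.rand() check)."""
    rng = random.Random(seed)
    selected = sum(
        sum(rng.random() < mlm_probability for _ in range(n_tokens))
        for _ in range(trials)
    )
    return selected / (n_tokens * trials)

low = expected_analyzed_fraction(0.15, 8, 10_000)   # close to 0.15
high = expected_analyzed_fraction(0.5, 8, 10_000)   # close to 0.5
```

Note that for a single short sentence the realized fraction can deviate noticeably from the probability; only the average over many draws converges.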
## User Experience Improvements

### Before the Fix

- User changes the MLM probability from 0.15 → 0.5
- No visual change in token colors
- Only the summary statistic changed (confusing!)

### After the Fix

- User changes the MLM probability from 0.15 → 0.5
- More tokens become colored (analyzed)
- Fewer tokens remain gray (non-analyzed)
- Summary shows the token count: "(3/8 tokens analyzed)"
- Clear visual feedback on the parameter's effect
## Testing the Fix

### 1. Quick Test

Try the same text with different MLM probabilities:

- Text: "Machine learning algorithms require computational resources"
- MLM 0.2: Few colored tokens
- MLM 0.8: Most tokens colored

### 2. Demo Script

```bash
python mlm_demo.py
```

Shows exactly how the MLM probability affects the analysis.

### 3. Visual Examples

The app now includes example pairs:

- The same text with MLM 0.2 vs. 0.8
- Shows a clear visual difference
## Technical Details

### Randomness Handling

- Uses `torch.rand()` for consistency with PyTorch
- Each token gets an independent random chance
- Reproducible with manual seeds for testing
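Fixing the seed before the per-token draws yields the same analysis mask on every run. A minimal analogue in plain Python (the app itself would call `torch.manual_seed()` before `torch.rand()` instead):

```python
import random

def select_tokens(n_tokens, mlm_probability, seed):
    """Return one boolean per token: True if it gets detailed analysis.
    Plain-Python analogue of the app's seeded torch.rand() draws."""
    rng = random.Random(seed)  # fixed seed -> identical selection each run
    return [rng.random() < mlm_probability for _ in range(n_tokens)]

run1 = select_tokens(8, 0.15, seed=42)
run2 = select_tokens(8, 0.15, seed=42)
# run1 == run2: the same seed reproduces the same analysis mask
```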
### Baseline Perplexity

- Non-analyzed tokens get perplexity = 2.0
- This represents "neutral" confidence
- Avoids misleading very low/high values

### Color Mapping

- Analyzed tokens: Full color spectrum based on actual perplexity
- Non-analyzed tokens: Gray (`rgb(200, 200, 200)`)
- Tooltips distinguish: "Perplexity: 5.2" vs. "Not analyzed"
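A minimal sketch of that mapping; the green/yellow/red thresholds here are illustrative assumptions, only the gray value comes from the documentation:

```python
def token_color(perplexity, analyzed, low=5.0, high=20.0):
    """Map a token's perplexity to a display color. The `low`/`high`
    cutoffs are illustrative; the app's actual thresholds may differ."""
    if not analyzed:
        return "rgb(200, 200, 200)"   # gray: token skipped by the MLM draw
    if perplexity < low:
        return "green"                # low perplexity: model is confident
    if perplexity < high:
        return "yellow"               # medium perplexity
    return "red"                      # high perplexity: model is surprised

token_color(5.2, analyzed=True)    # "yellow"
token_color(2.0, analyzed=False)   # "rgb(200, 200, 200)"
```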
## Performance Implications

### Lower MLM Probability (0.15)

- **Pros**: Faster; matches BERT training; realistic
- **Cons**: Sparse analysis; some tokens are not evaluated

### Higher MLM Probability (0.8)

- **Pros**: Comprehensive analysis; more visual information
- **Cons**: Slower computation; unrealistic for MLM

### Recommendation

- **Default 0.15**: Standard BERT-like analysis
- **Increase to 0.3-0.5**: For more detailed exploration
- **Avoid >0.8**: Diminishing returns, very slow
## Impact on Model Types

### Decoder Models (GPT, etc.)

- **No change**: The MLM probability only affects encoder models
- All tokens are always analyzed for next-token prediction

### Encoder Models (BERT, etc.)

- **Major improvement**: The MLM probability now has a clear visual effect
- Users can explore different analysis depths
- Better understanding of model confidence patterns
## User Guidance

### When to Use Different MLM Probabilities

**0.15 (Standard)**

- Quick analysis
- Matches BERT training
- Good for initial exploration

**0.3-0.4 (Detailed)**

- More comprehensive view
- Better for understanding difficult texts
- Reasonable computation time

**0.5+ (Comprehensive)**

- Maximum detail
- Research/analysis purposes
- Slower but thorough
## Future Enhancements

### Possible Improvements

1. **Adaptive MLM**: Adjust the probability based on text difficulty
2. **Token importance**: Prioritize content words over function words
3. **Interactive selection**: Let users click tokens to analyze
4. **Batch analysis**: Process multiple MLM probabilities simultaneously

### Configuration Options

The fix is fully configurable via `config.py`:

- Default MLM probability
- Min/max ranges
- Baseline perplexity value
- Color scheme for non-analyzed tokens
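A sketch of what such a `config.py` could look like; the option names and the min/max bounds are assumptions, while the default, baseline, and gray values come from this document:

```python
# config.py - illustrative sketch; option names are assumptions,
# values are taken from the documentation above where stated.
DEFAULT_MLM_PROBABILITY = 0.15              # standard BERT masking rate
MLM_PROBABILITY_MIN = 0.05                  # assumed lower bound of the slider
MLM_PROBABILITY_MAX = 0.95                  # assumed upper bound of the slider
BASELINE_PERPLEXITY = 2.0                   # neutral value for non-analyzed tokens
NON_ANALYZED_COLOR = "rgb(200, 200, 200)"   # gray for skipped tokens
```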
## Conclusion

This fix transforms the MLM probability from a hidden parameter that only affected summary statistics into a **visible, interactive control** that directly shapes the visualization. Users now get immediate visual feedback when adjusting the MLM probability, making the parameter's purpose clear and the analysis more engaging.

The fix maintains backward compatibility while significantly improving the user experience for encoder model analysis.