| """ | |
| Verify Bar Chart Numbers - Shows How Category Averages Are Computed | |
| This script explains the difference between: | |
| 1. BAR CHART numbers (category averages) | |
| 2. GUI numbers (individual model correlations) | |
| And shows you how to verify both. | |
| """ | |
| import pandas as pd | |
| import numpy as np | |
| print("="*80) | |
| print("VERIFYING BAR CHART NUMBERS") | |
| print("="*80) | |
| # Load the hierarchy results | |
| results = pd.read_csv('hierarchy_analysis/model_brain_hierarchy_results.csv') | |
| print(f"\nLoaded {len(results)} models from model_brain_hierarchy_results.csv") | |
| # Categorize models | |
| def get_model_type(model_name): | |
|     """Determine model type (vision, language, or statistical) from the model name.""" | |
|     if 'BOLD5000' in model_name or 'clip' in model_name.lower(): | |
|         return 'vision' | |
|     # Note: 'bert' also matches 'deberta' and 'roberta'; the explicit names are kept for readability | |
|     elif 'deberta' in model_name or 'bert' in model_name or 'simcse' in model_name or 'roberta' in model_name: | |
|         return 'language' | |
|     else: | |
|         return 'statistical' | |
| results['model_type_simple'] = results['model_name'].apply(get_model_type) | |
| # Count models by type | |
| type_counts = results['model_type_simple'].value_counts() | |
| print(f"\nModel counts:") | |
| print(f" Vision: {type_counts.get('vision', 0)} models") | |
| print(f" Language: {type_counts.get('language', 0)} models") | |
| print(f" Statistical: {type_counts.get('statistical', 0)} models") | |
| print("\n" + "="*80) | |
| print("BAR CHART NUMBERS (These are CATEGORY AVERAGES)") | |
| print("="*80) | |
| # Language models | |
| language_models = results[results['model_type_simple'] == 'language'] | |
| lang_early_mean = language_models['corr_early_visual'].mean() | |
| lang_late_mean = language_models['corr_late_semantic'].mean() | |
| lang_diff = lang_late_mean - lang_early_mean | |
| print(f"\nLANGUAGE MODELS (n={len(language_models)}):") | |
| print(f" Early Visual Average: r = {lang_early_mean:.4f}") | |
| print(f" Late Semantic Average: r = {lang_late_mean:.4f}") | |
| print(f" Difference: {lang_diff:+.4f}") | |
| print(f"\n ^^ These are the numbers in the bar chart for language models!") | |
| # Vision models | |
| vision_models = results[results['model_type_simple'] == 'vision'] | |
| vis_early_mean = vision_models['corr_early_visual'].mean() | |
| vis_late_mean = vision_models['corr_late_semantic'].mean() | |
| vis_diff = vis_late_mean - vis_early_mean | |
| print(f"\nVISION MODELS (n={len(vision_models)}):") | |
| print(f" Early Visual Average: r = {vis_early_mean:.4f}") | |
| print(f" Late Semantic Average: r = {vis_late_mean:.4f}") | |
| print(f" Difference: {vis_diff:+.4f}") | |
| print(f"\n ^^ These are the numbers in the bar chart for vision models!") | |
| print("\n" + "="*80) | |
| print("INDIVIDUAL MODEL EXAMPLES (What you see in GUI)") | |
| print("="*80) | |
| # Show some example language models | |
| print("\nEXAMPLE LANGUAGE MODELS (individual correlations):") | |
| language_examples = language_models.head(5) | |
| for idx, row in language_examples.iterrows(): | |
|     print(f"\n {row['model_name'][:60]}:") | |
|     print(f" Early: {row['corr_early_visual']:.4f}") | |
|     print(f" Late: {row['corr_late_semantic']:.4f}") | |
|     print(f" Diff: {row['diff_late_minus_early']:+.4f}") | |
| print("\n Notice: Each individual model has different numbers!") | |
| print(f" But when averaged: Early = {lang_early_mean:.4f}, Late = {lang_late_mean:.4f}") | |
| # Show some example vision models | |
| print("\nEXAMPLE VISION MODELS (individual correlations):") | |
| vision_examples = vision_models.head(5) | |
| for idx, row in vision_examples.iterrows(): | |
|     print(f"\n {row['model_name'][:60]}:") | |
|     print(f" Early: {row['corr_early_visual']:.4f}") | |
|     print(f" Late: {row['corr_late_semantic']:.4f}") | |
|     print(f" Diff: {row['diff_late_minus_early']:+.4f}") | |
| print("\n Notice: Individual vision models vary!") | |
| print(f" But when averaged: Early = {vis_early_mean:.4f}, Late = {vis_late_mean:.4f}") | |
| print("\n" + "="*80) | |
| print("HOW TO VERIFY IN GUI") | |
| print("="*80) | |
| print(""" | |
| THE GUI CANNOT SHOW CATEGORY AVERAGES DIRECTLY | |
| The GUI shows individual model correlations. To verify: | |
| OPTION 1: Verify individual models match | |
| ----------------------------------------- | |
| 1. Open the GUI: python app.py | |
| 2. Select a specific model (e.g., any language model) | |
| 3. Select brain measure: "Early Visual Average (7 ROIs)" | |
| 4. Note the correlation value in "Brain and ML Model" box | |
| 5. Find that same model in model_brain_hierarchy_results.csv | |
| 6. Check that the 'corr_early_visual' column matches | |
| 7. Repeat for "Late Semantic Average (12 ROIs)" --> 'corr_late_semantic' | |
| OPTION 2: Verify category averages from CSV | |
| -------------------------------------------- | |
| 1. Open: hierarchy_analysis/model_brain_hierarchy_results.csv | |
| 2. Filter to only language models (or vision models) | |
| 3. Calculate the average of the 'corr_early_visual' column | |
| 4. Calculate the average of the 'corr_late_semantic' column | |
| 5. These should match the bar chart numbers shown above | |
| OPTION 3: Trust this script! | |
| ----------------------------- | |
| This script computed the category averages from the same CSV file | |
| that the bar charts use. The numbers shown above ARE the bar chart numbers. | |
| """) | |
| print("\n" + "="*80) | |
| print("SPECIFIC VERIFICATION EXAMPLE") | |
| print("="*80) | |
| # Find a specific well-known model | |
| simcse_models = results[results['model_name'].str.contains('simcse', case=False)] | |
| if len(simcse_models) > 0: | |
|     example = simcse_models.iloc[0] | |
|     print(f"\nLet's verify: {example['model_name']}") | |
|     print(f"\nIn the CSV (model_brain_hierarchy_results.csv):") | |
|     print(f" corr_early_visual: {example['corr_early_visual']:.4f}") | |
|     print(f" corr_late_semantic: {example['corr_late_semantic']:.4f}") | |
|     print(f" diff_late_minus_early: {example['diff_late_minus_early']:.4f}") | |
|     print(f"\nIn the GUI:") | |
|     print(f" 1. Select model: {example['model_name']}") | |
|     print(f" 2. Select brain measure: 'Early Visual Average (7 ROIs)'") | |
|     print(f" --> Should show correlation: ~{example['corr_early_visual']:.3f}") | |
|     print(f" 3. Select brain measure: 'Late Semantic Average (12 ROIs)'") | |
|     print(f" --> Should show correlation: ~{example['corr_late_semantic']:.3f}") | |
|     print("\n If the GUI shows these values, it matches the CSV. [OK]") | |
| print("\n" + "="*80) | |
| print("SUMMARY") | |
| print("="*80) | |
| print(f""" | |
| BAR CHART NUMBERS (What you're trying to verify): | |
| Language Models: Early = {lang_early_mean:.3f}, Late = {lang_late_mean:.3f}, Diff = {lang_diff:+.3f} | |
| Vision Models: Early = {vis_early_mean:.3f}, Late = {vis_late_mean:.3f}, Diff = {vis_diff:+.3f} | |
| These are AVERAGES across many models, not individual model correlations. | |
| The GUI shows INDIVIDUAL models, which will have different numbers. | |
| To verify bar chart numbers: | |
| - Use this script (you just ran it!) | |
| - OR manually average the CSV columns | |
| - OR trust that the analysis is correct | |
| To verify individual models: | |
| - Compare GUI to CSV for that specific model | |
| - They should match to the precision the GUI displays | |
| All verifications should pass! The methodology is: | |
| 1. Compute correlation for each model with early/late brain averages | |
| 2. Average those correlations across all models in a category | |
| 3. Plot the category averages in the bar chart | |
| """) | |
| print("\n" + "="*80) | |
| print("END OF VERIFICATION") | |
| print("="*80) | |
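The three-step methodology printed in the summary (per-model correlations, then category averages, then the bar chart) can be checked on a tiny in-memory table without the real CSV. The sketch below is only an illustration: the model names and correlation values are hypothetical and invented for the example. The column names mirror the ones this script reads. It also shows that the difference of the category means equals the mean of the per-model differences, which holds provided no model is missing either of its two values.

```python
import pandas as pd

# Hypothetical per-model correlations, invented purely for illustration.
toy = pd.DataFrame({
    "model_name": ["bert-a", "bert-b", "clip-a", "clip-b"],
    "model_type_simple": ["language", "language", "vision", "vision"],
    "corr_early_visual": [0.10, 0.20, 0.30, 0.50],
    "corr_late_semantic": [0.30, 0.50, 0.20, 0.40],
})
toy["diff_late_minus_early"] = toy["corr_late_semantic"] - toy["corr_early_visual"]

# Average the per-model values within each category (the bar chart numbers).
category_means = toy.groupby("model_type_simple")[
    ["corr_early_visual", "corr_late_semantic", "diff_late_minus_early"]
].mean()

# Difference of the category means (late minus early).
diff_of_means = category_means["corr_late_semantic"] - category_means["corr_early_visual"]

# With no missing values, this equals the mean of the per-model differences.
assert ((diff_of_means - category_means["diff_late_minus_early"]).abs() < 1e-12).all()

print(category_means.round(4))
```

In this toy table the vision category has a negative late-minus-early difference, which is the case where a hard-coded "+" prefix in the printed output would be misleading.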