# 🚀 START HERE: Spatial-BEATs Documentation Guide

**Welcome!** You have access to a comprehensive analysis of the Spatial-BEATs codebase. This guide will direct you to exactly what you need.

---

## ⚡ Quick Pick Your Task

### "I have 5 minutes"
→ Read: [`SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`](SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md)

### "I have 15 minutes"
→ Read: [`README_DOCUMENTATION_INDEX.md`](README_DOCUMENTATION_INDEX.md) then [`ANALYSIS_COMPLETION_SUMMARY.md`](ANALYSIS_COMPLETION_SUMMARY.md)

### "I have 30 minutes"
→ Choose one:
- **New to codebase?** Read: Part 1 of [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md)
- **Debugging DOA gap?** Read: [`doa_train_valid_gap_analysis.md`](doa_train_valid_gap_analysis.md) Executive Summary
- **Planning experiments?** Read: [`0427_v11_series.md`](0427_v11_series.md) Section 1-2

### "I have 1-2 hours"
→ Full reading path for your role:
- **Researcher**: QUICK_REF → 0427_v11_series.md → ANALYSIS Part 2-3
- **Contributor**: QUICK_REF → ANALYSIS Part 1-2 → Pick component → read code
- **Investigator**: DOA_GAP Executive → Part 6 → Part 8 → Appendix

---

## 📚 The Five Documents

### 1. **README_DOCUMENTATION_INDEX.md** 🏠
**Navigation hub**: where to find what

- Use case lookup (choose your problem)
- Code component quick reference
- Reading order for different roles
- Cross-reference guide

**👉 Read this first if**: You're not sure where to start

---

### 2. **SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md** ⚡
**Practitioner's card**: fast lookup

- Framework table (4 frameworks, 1 page)
- Route A/B/C comparison
- Version series highlights
- Code locations by component
- Loss weight patterns
- When to use each configuration

**👉 Read this for**: Quick answers, practitioner reference

---

### 3. **SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md** 📖
**Deep technical reference**: architecture bible

- Part 1: Four spatial frameworks (Spatial-AST, DCASE SELD, EINV2, DETR)
- Part 2: Routes A/B/C with full specifications
- Part 3: Version evolution (v7→v11)
- Part 4-10: Implementation, configs, metrics, future work
- Appendix: Code reference table

**👉 Read this for**: Deep understanding, architecture details, code paths

---

### 4. **doa_train_valid_gap_analysis.md** 🔍
**Diagnostic & fix guide**: root cause analysis

- Executive Summary: 6 critical mechanisms
- Part 1: Data pipeline analysis
- Part 2: Loss computation asymmetry
- Part 3: Training configuration (v9/v10)
- Part 4: Validation metrics
- **Part 6: Root causes ranked by severity**
- **Part 7: Diagnostic checklist**
- **Part 8: Recommended fixes (prioritized)**
- Appendix: Code reference with exact line numbers

**👉 Read this for**: Debugging train/val gaps, understanding root causes

---

### 5. **ANALYSIS_COMPLETION_SUMMARY.md** 📋
**Executive overview**: what was found

- Deliverables summary (5 docs, 1,883 lines)
- Key findings (frameworks, routes, v11 series)
- Next steps (immediate vs experimental)
- How to use documents
- Verification checklist

**👉 Read this for**: Overview, decision-making, what comes next

---

## 🎯 Choose Your Path

### Path 1: "I want to understand the architecture (60 min)"
1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min)
2. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1-2 (25 min)
3. Pick a component from Part 6 Appendix, find in code (20 min)
4. spatial_beats_ov123_frame_routes.md if curious (10 min)

**Outcome**: Can navigate codebase, understand paradigms, modify code confidently

---

### Path 2: "I need to debug a train/val gap (30 min)"
1. doa_train_valid_gap_analysis.md Executive Summary (2 min)
2. Part 6: Check which mechanisms apply to your situation (5 min)
3. Part 7: Diagnostics; check your logs (10 min)
4. Part 8: Pick a fix priority (5 min)
5. Appendix: Get code locations (2 min)
6. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md if you need to modify (optional)

**Outcome**: Root cause identified, fix strategy chosen, code locations ready

---

### Path 3: "I want to run an experiment (v11 series) (45 min)"
1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min)
2. 0427_v11_series.md Section 1-2 (15 min)
3. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 3 (15 min)
4. 0427_v11_series.md Part 4 (verification method) (5 min)
5. Copy shell script from QUICK_REF (5 min)

**Outcome**: Experiment ready to launch, understanding of success metrics

---

### Path 4: "I'm new, I want to understand everything (2 hours)"
1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (10 min)
2. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1-3 (40 min)
3. spatial_beats_ov123_frame_routes.md (25 min)
4. spatial_beats_training_overview.md (20 min)
5. Pick component, trace through code with Part 6 references (20 min)
6. doa_train_valid_gap_analysis.md Part 6 for context (5 min)

**Outcome**: Comprehensive understanding of system, ready to contribute

---

## 🔑 Key Findings at a Glance

### 4 Spatial Frameworks in Codebase
- **Spatial-AST**: Task tokens (pre-trunk)
- **DCASE SELD**: Per-class activity+DOA
- **EINV2**: Learnable track queries
- **DETR**: Per-frame K-slot allocation

### 3 Parallel Routes (A/B/C)
- **Route A**: Per-frame K-slot, per-step Hungarian
- **Route B**: Learnable queries, clip-level Hungarian (PRODUCTION v9)
- **Route C**: Per-class vectors (PROTOTYPE, v11c test)

### DOA Train/Val Gap Root Causes
1. ⚠️⚠️⚠️ **ZERO spatial augmentation (rotations)**: 40-60% of variance
2. ⚠️⚠️ **SpecAugment train-only**: 10-20% variance
3. ⚠️⚠️ **v10 freezes direction head**: 30-40% on multi-source
4. ⚠️ **Regression sensitivity**: 5-15% variance
5. ⚠️ **Detached prediction asymmetry**: 2-5% variance

### v11 Experiments (Parallel Runs)
- **v11a**: DOA demixer → ov2 angles ↓ 5pp+
- **v11b**: LocalSpatial pre-pool → test IV necessity
- **v11c**: ACCDOA paradigm → ov3 binding ↓ 5pp+
- **v11d**: Post-hoc calibration → ov1 ranking ↑ 5pp+

---

## 📞 FAQ

**Q: Where do I find the direction head loss?**
A: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Appendix → search "direction loss" → `spatial_loss.py:1562-1565`

**Q: What's the difference between routes?**
A: See the comparison table in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` or the detailed Part 2 of `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`

**Q: Should I implement fix #1, #2, or #3?**
A: Read `doa_train_valid_gap_analysis.md` Part 6 (root causes ranked by severity) and Part 8 (recommended fixes, prioritized), then pick based on your gap size and risk tolerance.

**Q: How do I run v11a?**
A: Shell script in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` v11 section + spec in `0427_v11_series.md` Section 2.2

**Q: I'm stuck on a component. Where's the code?**
A: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 6 has a complete reference table with file:line for every component.

---

## 🎁 You Now Have

- ✅ **Navigation guide** for all documents
- ✅ **Quick reference card** with all the essentials
- ✅ **Architecture bible** with code paths
- ✅ **Diagnostic guide** for train/val gaps
- ✅ **Experimental specifications** for v11 series
- ✅ **Comprehensive metadata** (1,883 lines, 77KB)
- ✅ **All findings tied to exact code locations**

---

## 🚀 Next Steps

1. **Choose your path above** based on how much time you have
2. **Follow the reading order** in that path
3. **Use cross-references** when you need more detail
4. **Check Appendices** for exact code locations
5. **Reference Part 6/Part 8** when implementing

---

## 📊 Document Overview

| Document | Size | Time | Purpose |
|----------|------|------|---------|
| README_DOCUMENTATION_INDEX | 12KB | 5-10m | Navigation hub |
| SPATIAL_FRAMEWORKS_QUICK_REFERENCE | 7KB | 5-10m | Quick lookup |
| SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS | 28KB | 30-45m | Deep reference |
| doa_train_valid_gap_analysis | 19KB | 20-30m | Diagnostics |
| ANALYSIS_COMPLETION_SUMMARY | 11KB | 10m | Executive summary |
| **TOTAL** | **77KB** | **2-4 hours** | **Complete set** |

---

**Status**: ✅ Complete and ready for use

**Created**: 2026-04-27

**Next update**: After v11 experiments

👉 **Pick your path above and start reading!**