| # START HERE: Spatial-BEATs Documentation Guide |
|
|
| **Welcome!** You have access to a comprehensive analysis of the Spatial-BEATs codebase. This guide directs you to exactly what you need. |
|
|
| --- |
|
|
| ## ⚡ Quick Pick Your Task |
|
|
| ### "I have 5 minutes" |
| → Read: [`SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`](SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md) |
|
|
| ### "I have 15 minutes" |
| → Read: [`README_DOCUMENTATION_INDEX.md`](README_DOCUMENTATION_INDEX.md), then [`ANALYSIS_COMPLETION_SUMMARY.md`](ANALYSIS_COMPLETION_SUMMARY.md) |
|
|
| ### "I have 30 minutes" |
| → Choose one: |
| - **New to codebase?** Read: Part 1 of [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) |
| - **Debugging DOA gap?** Read: [`doa_train_valid_gap_analysis.md`](doa_train_valid_gap_analysis.md) Executive Summary |
| - **Planning experiments?** Read: [`0427_v11_series.md`](0427_v11_series.md) Sections 1-2 |
|
|
| ### "I have 1-2 hours" |
| → Full reading path for your role: |
| - **Researcher**: QUICK_REF → 0427_v11_series.md → ANALYSIS Part 2-3 |
| - **Contributor**: QUICK_REF → ANALYSIS Part 1-2 → Pick component → read code |
| - **Investigator**: DOA_GAP Executive → Part 6 → Part 8 → Appendix |
| |
| --- |
| |
| ## The Five Documents |
| |
| ### 1. **README_DOCUMENTATION_INDEX.md** |
| **Navigation hub**: Where to find what |
| - Use case lookup (choose your problem) |
| - Code component quick reference |
| - Reading order for different roles |
| - Cross-reference guide |
| |
| **Read this first if**: You're not sure where to start |
| |
| --- |
| |
| ### 2. **SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md** |
| ⚡ **Practitioner's card**: Fast lookup |
| - Framework table (4 frameworks, 1 page) |
| - Route A/B/C comparison |
| - Version series highlights |
| - Code locations by component |
| - Loss weight patterns |
| - When to use each configuration |
| |
| **Read this for**: Quick answers, practitioner reference |
| |
| --- |
| |
| ### 3. **SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md** |
| **Deep technical reference**: Architecture bible |
| - Part 1: Four spatial frameworks (Spatial-AST, DCASE SELD, EINV2, DETR) |
| - Part 2: Routes A/B/C with full specifications |
| - Part 3: Version evolution (v7→v11) |
| - Part 4-10: Implementation, configs, metrics, future work |
| - Appendix: Code reference table |
| |
| **Read this for**: Deep understanding, architecture details, code paths |
| |
| --- |
| |
| ### 4. **doa_train_valid_gap_analysis.md** |
| **Diagnostic & fix guide**: Root cause analysis |
| - Executive Summary: 6 critical mechanisms |
| - Part 1: Data pipeline analysis |
| - Part 2: Loss computation asymmetry |
| - Part 3: Training configuration (v9/v10) |
| - Part 4: Validation metrics |
| - **Part 6: Root causes ranked by severity** |
| - **Part 7: Diagnostic checklist** |
| - **Part 8: Recommended fixes (prioritized)** |
| - Appendix: Code reference with exact line numbers |
| |
| **Read this for**: Debugging train/val gaps, understanding root causes |
| |
| --- |
| |
| ### 5. **ANALYSIS_COMPLETION_SUMMARY.md** |
| **Executive overview**: What was found |
| - Deliverables summary (5 docs, 1,883 lines) |
| - Key findings (frameworks, routes, v11 series) |
| - Next steps (immediate vs experimental) |
| - How to use documents |
| - Verification checklist |
| |
| **Read this for**: Overview, decision-making, what comes next |
| |
| --- |
| |
| ## 🎯 Choose Your Path |
| |
| ### Path 1: "I want to understand the architecture (60 min)" |
| 1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min) |
| 2. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1-2 (25 min) |
| 3. Pick a component from Part 6 Appendix, find in code (20 min) |
| 4. spatial_beats_ov123_frame_routes.md if curious (10 min) |
| |
| **Outcome**: Can navigate codebase, understand paradigms, modify code confidently |
| |
| --- |
| |
| ### Path 2: "I need to debug a train/val gap (30 min)" |
| 1. doa_train_valid_gap_analysis.md Executive Summary (2 min) |
| 2. Part 6: Check which mechanisms apply to your situation (5 min) |
| 3. Part 7: Run the diagnostics against your logs (10 min) |
| 4. Part 8: Pick a fix priority (5 min) |
| 5. Appendix: Get code locations (2 min) |
| 6. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md if you need to modify (optional) |
|
|
| **Outcome**: Root cause identified, fix strategy chosen, code locations ready |
|
|
| --- |
|
|
| ### Path 3: "I want to run an experiment (v11 series) (45 min)" |
| 1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min) |
| 2. 0427_v11_series.md Section 1-2 (15 min) |
| 3. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 3 (15 min) |
| 4. 0427_v11_series.md Part 4 (verification method) (5 min) |
| 5. Copy shell script from QUICK_REF (5 min) |
| |
| **Outcome**: Experiment ready to launch, understanding of success metrics |
| |
| --- |
| |
| ### Path 4: "I'm new, I want to understand everything (2 hours)" |
| 1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (10 min) |
| 2. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1-3 (40 min) |
| 3. spatial_beats_ov123_frame_routes.md (25 min) |
| 4. spatial_beats_training_overview.md (20 min) |
| 5. Pick component, trace through code with Part 6 references (20 min) |
| 6. doa_train_valid_gap_analysis.md Part 6 for context (5 min) |
|
|
| **Outcome**: Comprehensive understanding of system, ready to contribute |
|
|
| --- |
|
|
| ## Key Findings at a Glance |
|
|
| ### 4 Spatial Frameworks in Codebase |
| - **Spatial-AST**: Task tokens (pre-trunk) |
| - **DCASE SELD**: Per-class activity+DOA |
| - **EINV2**: Learnable track queries |
| - **DETR**: Per-frame K-slot allocation |
|
|
| ### 3 Parallel Routes (A/B/C) |
| - **Route A**: Per-frame K-slot, per-step Hungarian |
| - **Route B**: Learnable queries, clip-level Hungarian (PRODUCTION v9) |
| - **Route C**: Per-class vectors (PROTOTYPE, v11c test) |
|
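The difference between per-step and clip-level matching can be illustrated with a small sketch. This is not the repository's implementation: the function names are hypothetical, the cost is assumed to be the angular distance between unit DOA vectors, and an exhaustive permutation search stands in for the Hungarian algorithm (it finds the same minimum-cost assignment, but is only practical for small K).

```python
import itertools
import math

def angular_cost(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))

def best_assignment(cost):
    """Minimum-cost slot-to-target permutation for a small square cost matrix.

    Exhaustive search is used here in place of the Hungarian algorithm.
    """
    k = len(cost)
    return min(
        itertools.permutations(range(k)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(k)),
    )

def match_per_frame(preds, targets):
    """Route-A-style matching: solve one assignment per frame."""
    assignments = []
    for p, t in zip(preds, targets):
        cost = [[angular_cost(pi, tj) for tj in t] for pi in p]
        assignments.append(best_assignment(cost))
    return assignments

def match_per_clip(preds, targets):
    """Route-B-style matching: one assignment shared by all frames of a clip."""
    k = len(preds[0])
    # Accumulate each (slot, target) cost over every frame before matching.
    cost = [
        [sum(angular_cost(p[i], t[j]) for p, t in zip(preds, targets))
         for j in range(k)]
        for i in range(k)
    ]
    return best_assignment(cost)
```

With per-frame matching a slot may follow a different target in each frame, whereas clip-level matching commits each slot (or query) to a single target for the whole clip.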
|
| ### DOA Train/Val Gap Root Causes |
| 1. ⚠️⚠️⚠️ **ZERO spatial augmentation (rotations)** → 40-60% of variance |
| 2. ⚠️⚠️ **SpecAugment train-only** → 10-20% variance |
| 3. ⚠️⚠️ **v10 freezes direction head** → 30-40% on multi-source |
| 4. ⚠️ **Regression sensitivity** → 5-15% variance |
| 5. ⚠️ **Detached prediction asymmetry** → 2-5% variance |
|
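As an illustration of the rotation augmentation named in item 1, the sketch below applies one yaw rotation to both the audio and the DOA label. It assumes first-order Ambisonics input whose horizontal X and Y channels are available as separate sample lists; the function name and data layout are hypothetical and not taken from the codebase.

```python
import math

def rotate_scene(x_chan, y_chan, doa, theta):
    """Rotate a sound scene about the vertical (z) axis by theta radians.

    x_chan, y_chan: samples of the FOA X and Y channels (assumed layout).
    doa: (x, y, z) unit vector label for the source.
    The W and Z channels are unchanged by a yaw rotation, so they are omitted.
    """
    c, s = math.cos(theta), math.sin(theta)
    # X and Y transform like the horizontal components of a vector.
    new_x = [c * x - s * y for x, y in zip(x_chan, y_chan)]
    new_y = [s * x + c * y for x, y in zip(x_chan, y_chan)]
    # The label must receive the same rotation as the audio.
    dx, dy, dz = doa
    new_doa = (c * dx - s * dy, s * dx + c * dy, dz)
    return new_x, new_y, new_doa
```

Rotating audio and label together keeps each training pair consistent while exposing the model to directions that the recorded data may not cover.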
|
| ### v11 Experiments (Parallel Runs) |
| - **v11a**: DOA demixer → ov2 angles ↑ 5pp+ |
| - **v11b**: LocalSpatial pre-pool → test IV necessity |
| - **v11c**: ACCDOA paradigm → ov3 binding ↑ 5pp+ |
| - **v11d**: Post-hoc calibration → ov1 ranking ↑ 5pp+ |
|
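For readers unfamiliar with the ACCDOA paradigm tested in v11c: each class is represented by one Cartesian vector whose length encodes activity and whose direction encodes the DOA. The sketch below shows how such an output could be decoded; the function name, the dictionary layout, and the 0.5 threshold are assumptions for illustration, not values from the v11c spec.

```python
import math

def decode_accdoa(vectors, threshold=0.5):
    """Decode per-class ACCDOA vectors into active events.

    vectors: mapping of class name -> (x, y, z) predicted vector.
    Returns a mapping of class name -> (azimuth_deg, elevation_deg) for every
    class whose vector norm exceeds the activity threshold.
    """
    events = {}
    for cls, (x, y, z) in vectors.items():
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > threshold:
            azimuth = math.degrees(math.atan2(y, x))
            # Clamp to guard against rounding pushing the ratio past +/-1.
            elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
            events[cls] = (azimuth, elevation)
    return events
```

Because activity and direction share one vector, no separate detection head or slot-to-target matching is needed, at the cost of representing at most one source per class per frame.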
|
| --- |
|
|
| ## FAQ |
|
|
| **Q: Where do I find the direction head loss?** |
| A: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Appendix → search "direction loss" → `spatial_loss.py:1562-1565` |
|
|
| **Q: What's the difference between routes?** |
| A: Compare table in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` or detailed Part 2 of `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` |
|
|
| **Q: Should I implement fix #1, #2, or #3?** |
| A: Read `doa_train_valid_gap_analysis.md` Part 6 (root causes ranked by severity) and Part 8 (prioritized fixes), then pick based on your gap size and risk tolerance. |
|
|
| **Q: How do I run v11a?** |
| A: Shell script in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` v11 section + spec in `0427_v11_series.md` Section 2.2 |
|
|
| **Q: I'm stuck on a component. Where's the code?** |
| A: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 6 has complete reference table with file:line for every component. |
|
|
| --- |
|
|
| ## You Now Have |
|
|
| ✅ **Navigation guide** for all documents |
| ✅ **Quick reference card** with all the essentials |
| ✅ **Architecture bible** with code paths |
| ✅ **Diagnostic guide** for train/val gaps |
| ✅ **Experimental specifications** for v11 series |
| ✅ **Comprehensive metadata** (1,883 lines, 77KB) |
| ✅ **All findings tied to exact code locations** |
|
|
| --- |
|
|
| ## Next Steps |
|
|
| 1. **Choose your path above** based on how much time you have |
| 2. **Follow the reading order** in that path |
| 3. **Use cross-references** when you need more detail |
| 4. **Check Appendices** for exact code locations |
| 5. **Reference Part 6/Part 8** when implementing |
|
|
| --- |
|
|
| ## Document Overview |
|
|
| | Document | Size | Time | Purpose | |
| |----------|------|------|---------| |
| | README_DOCUMENTATION_INDEX | 12KB | 5-10m | Navigation hub | |
| | SPATIAL_FRAMEWORKS_QUICK_REFERENCE | 7KB | 5-10m | Quick lookup | |
| | SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS | 28KB | 30-45m | Deep reference | |
| | doa_train_valid_gap_analysis | 19KB | 20-30m | Diagnostics | |
| | ANALYSIS_COMPLETION_SUMMARY | 11KB | 10m | Executive summary | |
| | **TOTAL** | **77KB** | **~1-2 hours** | **Complete set** | |
|
|
| --- |
|
|
| **Status**: ✅ Complete and ready for use |
| **Created**: 2026-04-27 |
| **Next update**: After v11 experiments |
|
|
| **Pick your path above and start reading!** |
|
|