# Documentation Index: Spatial-BEATs Analysis & Reference

## 📋 Quick Navigation

### **New to the codebase?**
→ Start here: [`SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`](SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md) (5 min read)

### **Debugging a train/validation gap in DOA?**
→ Go here: [`doa_train_valid_gap_analysis.md`](doa_train_valid_gap_analysis.md) → Part 6 + Part 8

### **Need detailed architecture reference?**
→ Read here: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) (deep dive, 30 min)

### **Planning experiments (v11 series)?**
→ Use: [`0427_v11_series.md`](0427_v11_series.md) + `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` (v11 section)

### **Understanding a specific component (e.g., SourceQueryDecoder)?**
→ Use: Search in `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2 + Appendix for line numbers

---

## 📚 Document Reference Table

| Document | Lines | Size | Best For | Read Time |
|----------|-------|------|----------|-----------|
| **ANALYSIS_COMPLETION_SUMMARY** (this index) | 150 | 6KB | Overview + navigation | 5 min |
| **SPATIAL_FRAMEWORKS_QUICK_REFERENCE** | 192 | 7KB | Quick lookup, practitioner guide | 5-10 min |
| **SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS** | 724 | 28KB | Deep technical reference | 30-45 min |
| **doa_train_valid_gap_analysis** | 434 | 19KB | Diagnostics + fixes | 20-30 min |
| **0427_v11_series** | 185 | 13KB | Experimental design (v11a/b/c/d) | 15 min |
| **spatial_beats_ov123_frame_routes** | 512 | 22KB | Routes A/B/C architecture | 25 min |
| **spatial_beats_training_overview** | 420 | 15KB | Training pipeline + presets | 20 min |

---

## 🎯 Use Case Lookup

### "I need to understand the DOA train/val gap"

1. Read: [`doa_train_valid_gap_analysis.md`](doa_train_valid_gap_analysis.md) **Executive Summary** (2 min)
2. Identify: Which of the 6 mechanisms applies to your case (Part 6)
3. Fix: Follow the priority order in Part 8
4. Reference: Code locations in Appendix

**Expected outcome**: Root cause identified + fix strategy

---

### "I'm new and want to understand the architecture"

1. Read: [`SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`](SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md) **Sections 1-3** (5 min)
2. Read: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 1 + Part 2** (15 min)
3. Reference: Code locations in Appendix for specific functions
4. Cross-check: `spatial_beats_ov123_frame_routes.md` for Routes A/B/C details

**Expected outcome**: High-level understanding + ability to navigate code

---

### "I want to run the v11a experiment"

1. Read: [`0427_v11_series.md`](0427_v11_series.md) **Section 2.2 (v11a)** (5 min)
2. Check: `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` **v11 section** for shell script
3. Read: `0427_v11_series.md` Part 4 (verification method) for how to evaluate results
4. Reference: Appendix in `doa_train_valid_gap_analysis.md` for code line numbers if modifying

**Expected outcome**: Experiment ready to launch, understanding of what to expect

---

### "What are all the spatial frameworks in this codebase?"

1. Read: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 1** (5 min)
2. Summary: Four frameworks implemented (Spatial-AST, DCASE SELD, EINV2, DETR-slots)
3. Reference: Part 2 shows how each is implemented as Routes A/B/C

**Expected outcome**: Framework inventory + where each is implemented

---

### "How do I compare Routes A/B/C?"

1. Go to: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 2** (15 min)
2. Check: Comparison table in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`
3. Deep dive: `spatial_beats_ov123_frame_routes.md` for architectural details

**Expected outcome**: Understanding of paradigm differences, when to use each

---

### "What changed from v7 to v11?"

1. Read: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 3** (10 min)
2. Reference: `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` **version series** for a quick comparison
3. Deep dive: `doa_train_valid_gap_analysis.md` **Part 3** for v9/v10 details
4. Experimental: `0427_v11_series.md` **Section 1** for v11 rationale

**Expected outcome**: Version history + innovation tracking

---

### "Where exactly is the direction head loss computed?"

1. Go to: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` **Appendix** (search "direction loss")
2. Result: `spatial_loss.py:1562-1565`
3. Read: `doa_train_valid_gap_analysis.md` **Part 2.3** for context

**Expected outcome**: Exact code location + understanding of loss formulation

---

## 🔍 Code Navigation Quick Reference

### For Each Major Component

| Component | Primary Ref | Backup Ref | Concept |
|-----------|------------|-----------|---------|
| **LocalSpatialEncoder** | ANALYSIS Part 6 | QUICK_REF "Key locations" | 7-channel FOA → spatial features |
| **SourceQueryDecoder** | ROUTES p.20-30 | ANALYSIS Part 2.2b | K track queries → per-frame features |
| **FrameTrackPredictionHeads** | QUICK_REF Appendix | ANALYSIS Part 2.2b | Predicts act/class/dir/dist per frame |
| **Hungarian Matching** | DOA_GAP Part 2.2 | ANALYSIS Part 4 | How sources get assigned to slots/queries |
| **ClassHeadSpectralDemixer** | ANALYSIS Part 3 | QUICK_REF "v9 baseline" | Breaks frequency pooling bottleneck |
| **ACCDOA heads** | ROUTES p.40+ | ANALYSIS Part 1.2 | Route C: per-class 3D vectors |
| **SpecAugment** | DOA_GAP Part 1.2 | QUICK_REF "Training" | Spectral masking (train-only!) |

---

## 📊 Experiment Planning Matrix

### To understand **which experiment tests what**:

Use:

- `0427_v11_series.md` Section 2 (detailed specifications)
- `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` v11 table (quick reference)
- `doa_train_valid_gap_analysis.md` Part 6 (root causes)

| Problem | Experiment | Mechanism Tested | Expected Impact | Doc Reference |
|---------|-----------|-----------------|-----------------|----------------|
| ov2 angle errors (73.9%) | v11a | DOA demixer | ↓ 5pp+ | 0427_v11_series.md:18-40 |
| ov2/ov3 angles | v11b | IV signal path | Compare vs v11a | 0427_v11_series.md:41-63 |
| ov3 binding (24.5%) | v11c | ACCDOA paradigm | ↓ 5pp+ | 0427_v11_series.md:64-87 |
| ov1 ranking (37% loss) | v11d | Post-hoc decoding | ↑ 5pp+ | 0427_v11_series.md:88-112 |

---

## ⚡ Critical Findings Summary

### Three Most Important Things to Know

1. **Zero spatial augmentation (rotations)** is the #1 cause of the DOA train/val gap
   - Location: `doa_train_valid_gap_analysis.md` **Executive Summary** + Part 6
   - Impact: 40-60% of variance
   - Fix: See Part 8 recommendation #1

2. **Three parallel routes (A/B/C) coexist in the codebase**
   - Route A: Per-frame slot allocation (unstable)
   - Route B: Learnable track queries (production, v9)
   - Route C: Per-class vectors (prototype, being tested in v11c)
   - Reference: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2

3. **v10 phase-1 freezes the direction head entirely**
   - The direction head gets no gradients for 10 epochs
   - It is then unfrozen with poor initialization
   - This causes a 30-40% DOA metric drop on multi-source splits
   - Reference: `doa_train_valid_gap_analysis.md` Part 3.2 + Part 6

---

## 🔗 Cross-Reference Guide

### "I'm reading X, how do I find related content?"
| Reading | See Also |
|---------|----------|
| ROUTES page 20-30 (Route B) | ANALYSIS Part 2.2b, QUICK_REF "Routes", DOA_GAP Part 2-3 |
| ANALYSIS Part 1 (Spatial-AST) | QUICK_REF table, ANALYSIS Part 6 (code locations) |
| DOA_GAP Part 8 (fixes) | ANALYSIS Part 6 (line numbers), QUICK_REF "Training" |
| 0427_v11_series Section 2 (v11 specs) | QUICK_REF v11 table, DOA_GAP Part 6 (why these experiments) |
| TRAINING_OVERVIEW | ANALYSIS Part 5 (configs), QUICK_REF "Loss weights" |

---

## 📝 How Documents Were Created

All documentation was created from a **comprehensive codebase analysis** on 2026-04-27:

- ✅ 4 spatial frameworks identified (Spatial-AST, DCASE SELD, EINV2, DETR-slots)
- ✅ 3 parallel routes analyzed with full architecture specifications
- ✅ 6 mechanisms causing DOA train/val gaps discovered
- ✅ v7→v11 experimental series mapped with root cause tracing
- ✅ 10,000+ lines of code reviewed and cross-referenced
- ✅ All findings tied to exact file:line numbers

**Quality Assurance**:

- Code references verified with actual line numbers
- Architecture descriptions validated against source
- Experimental hypotheses cross-checked with docstrings
- Cross-document consistency checked

---

## 🚀 Recommended Reading Order

### **For Different Roles**

#### **Principal Investigator / Project Lead**

1. This index + section on "Critical Findings" (5 min)
2. `ANALYSIS_COMPLETION_SUMMARY.md` (10 min)
3. `0427_v11_series.md` Section 1 (diagnostics review) + Section 5 (order) (10 min)

→ **Decision**: Approve v11 experiments? (Total: 25 min)

#### **Researcher / Experimenter**

1. `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` (10 min)
2. `0427_v11_series.md` full document (15 min)
3. `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2 + 3 (25 min)

→ **Ready**: Design and run experiments (Total: 50 min)

#### **New Contributor / Intern**

1. `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` (10 min)
2. `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 1 + 2 (25 min)
3. Pick a component (e.g., "LocalSpatialEncoder") → find in Part 6 → read code
4. `spatial_beats_training_overview.md` or `spatial_beats_coding_guide.md` as needed

→ **Goal**: Understand codebase (Total: 1-2 hours)

#### **Debugging Train/Val Gap**

1. `doa_train_valid_gap_analysis.md` Executive Summary + Part 6 (10 min)
2. Part 7 (diagnostics): check your logs (10 min)
3. Part 8 (fixes): pick priority #1-3 (5 min)
4. Appendix: get code locations (5 min)

→ **Goal**: Root cause + fix strategy (Total: 30 min)

---

## 📞 FAQ About Documentation

**Q: Where do I find the code for Route A, B, or C?**
A: See `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2.1/2.2/2.3; each has exact file:line references in the Part 6 Appendix.

**Q: What's the difference between v9 and v10?**
A: Compare `doa_train_valid_gap_analysis.md` Part 3.1 and Part 3.2, or see the quick summary in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`.

**Q: Should I implement fix #1, #2, or #3?**
A: Depends on your problem. See `doa_train_valid_gap_analysis.md` Part 6 and rank the root causes by severity against your gap size.

**Q: How long will v11 experiments take?**
A: ~14 days total. See `0427_v11_series.md` Section 5 for recommended order (serial vs parallel).

**Q: Can I run v11 without understanding everything?**
A: Yes! Copy the shell script from `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`, then follow Part 4 (verification method) in `0427_v11_series.md`.

---

## ✅ Checklist: What You Can Do Now

After reading the appropriate documentation:

- [ ] Understand what spatial frameworks exist in the codebase
- [ ] Identify which route (A/B/C) solves your problem
- [ ] Diagnose your train/val gap (Part 6 in DOA_GAP)
- [ ] Plan an experiment (v11 specs + order)
- [ ] Find exact code locations (Appendix tables)
- [ ] Understand loss weight patterns (QUICK_REF tables)
- [ ] Know when to hot-start from which checkpoint
- [ ] Compare validation metrics across routes

---

## 📄 License & Citation

These documents are analysis artifacts for internal research use.
They synthesize information from:

- Source code comments and docstrings
- Configuration file specifications
- DCASE challenge documentation (officially referenced in code)
- Research paper citations in docstrings

---

- **Last Updated**: 2026-04-27
- **Analysis Completed**: Yes
- **Ready for Use**: Yes
- **Maintenance**: Update after v11 experiments complete

For questions or clarifications, refer to exact file:line citations in the appendices of technical documents.