START HERE: Spatial-BEATs Documentation Guide
Welcome! You have access to a comprehensive analysis of the Spatial-BEATs codebase. This guide points you to the document you need.
Quick Pick Your Task
"I have 5 minutes"
→ Read: SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md
"I have 15 minutes"
→ Read: README_DOCUMENTATION_INDEX.md then ANALYSIS_COMPLETION_SUMMARY.md
"I have 30 minutes"
→ Choose one:
- New to the codebase? Read: Part 1 of SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md
- Debugging the DOA gap? Read: doa_train_valid_gap_analysis.md Executive Summary
- Planning experiments? Read: 0427_v11_series.md Section 1-2
"I have 1-2 hours"
→ Full reading path for your role:
- Researcher: QUICK_REF → 0427_v11_series.md → ANALYSIS Part 2-3
- Contributor: QUICK_REF → ANALYSIS Part 1-2 → Pick component → read code
- Investigator: DOA_GAP Executive → Part 6 → Part 8 → Appendix
The Five Documents
1. README_DOCUMENTATION_INDEX.md
Navigation hub: where to find what
- Use case lookup (choose your problem)
- Code component quick reference
- Reading order for different roles
- Cross-reference guide
Read this first if: You're not sure where to start
2. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md
Practitioner's card: fast lookup
- Framework table (4 frameworks, 1 page)
- Route A/B/C comparison
- Version series highlights
- Code locations by component
- Loss weight patterns
- When to use each configuration
Read this for: Quick answers, practitioner reference
3. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md
Deep technical reference: architecture bible
- Part 1: Four spatial frameworks (Spatial-AST, DCASE SELD, EINV2, DETR)
- Part 2: Routes A/B/C with full specifications
- Part 3: Version evolution (v7 to v11)
- Part 4-10: Implementation, configs, metrics, future work
- Appendix: Code reference table
Read this for: Deep understanding, architecture details, code paths
4. doa_train_valid_gap_analysis.md
Diagnostic & fix guide: root cause analysis
- Executive Summary: 6 critical mechanisms
- Part 1: Data pipeline analysis
- Part 2: Loss computation asymmetry
- Part 3: Training configuration (v9/v10)
- Part 4: Validation metrics
- Part 6: Root causes ranked by severity
- Part 7: Diagnostic checklist
- Part 8: Recommended fixes (prioritized)
- Appendix: Code reference with exact line numbers
Read this for: Debugging train/val gaps, understanding root causes
5. ANALYSIS_COMPLETION_SUMMARY.md
Executive overview: what was found
- Deliverables summary (5 docs, 1,883 lines)
- Key findings (frameworks, routes, v11 series)
- Next steps (immediate vs experimental)
- How to use documents
- Verification checklist
Read this for: Overview, decision-making, what comes next
Choose Your Path
Path 1: "I want to understand the architecture (60 min)"
- SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min)
- SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1-2 (25 min)
- Pick a component from Part 6 Appendix, find in code (20 min)
- spatial_beats_ov123_frame_routes.md if curious (10 min)
Outcome: Can navigate codebase, understand paradigms, modify code confidently
Path 2: "I need to debug a train/val gap (30 min)"
- doa_train_valid_gap_analysis.md Executive Summary (2 min)
- Part 6: Check which mechanisms apply to your situation (5 min)
- Part 7: Diagnostics, check your logs (10 min)
- Part 8: Pick a fix priority (5 min)
- Appendix: Get code locations (2 min)
- SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md if you need to modify (optional)
Outcome: Root cause identified, fix strategy chosen, code locations ready
Path 3: "I want to run an experiment (v11 series) (45 min)"
- SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min)
- 0427_v11_series.md Section 1-2 (15 min)
- SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 3 (15 min)
- 0427_v11_series.md Part 4 (verification method) (5 min)
- Copy shell script from QUICK_REF (5 min)
Outcome: Experiment ready to launch, understanding of success metrics
Path 4: "I'm new, I want to understand everything (2 hours)"
- SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (10 min)
- SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1-3 (40 min)
- spatial_beats_ov123_frame_routes.md (25 min)
- spatial_beats_training_overview.md (20 min)
- Pick component, trace through code with Part 6 references (20 min)
- doa_train_valid_gap_analysis.md Part 6 for context (5 min)
Outcome: Comprehensive understanding of system, ready to contribute
Key Findings at a Glance
4 Spatial Frameworks in Codebase
- Spatial-AST: Task tokens (pre-trunk)
- DCASE SELD: Per-class activity+DOA
- EINV2: Learnable track queries
- DETR: Per-frame K-slot allocation
3 Parallel Routes (A/B/C)
- Route A: Per-frame K-slot, per-step Hungarian
- Route B: Learnable queries, clip-level Hungarian (PRODUCTION v9)
- Route C: Per-class vectors (PROTOTYPE, v11c test)
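The practical difference between per-step matching (Route A) and clip-level matching (Route B) is when the prediction-to-target assignment is solved. The sketch below is illustrative only: the function names, the angular-error cost, and the brute-force solver (which finds the same optimum as the Hungarian algorithm for the small slot counts used here) are assumptions, not the code in spatial_loss.py.

```python
import math
from itertools import permutations

def angular_error(u, v):
    """Angle in radians between two unit 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))

def best_assignment(cost):
    """Optimal one-to-one assignment for a small square cost matrix.

    Brute force over permutations; for small K this yields the same
    optimum a Hungarian solver would return.
    """
    k = len(cost)
    return min(permutations(range(k)),
               key=lambda p: sum(cost[i][p[i]] for i in range(k)))

def per_frame_matching(pred, target):
    """Route A style (assumed): solve a separate assignment for every frame.

    pred[t][k] and target[t][k] are unit DOA vectors.
    """
    result = []
    for p_t, g_t in zip(pred, target):
        cost = [[angular_error(p, g) for g in g_t] for p in p_t]
        result.append(best_assignment(cost))
    return result

def clip_level_matching(pred, target):
    """Route B style (assumed): one assignment per clip, cost summed over frames."""
    k = len(pred[0])
    frames = range(len(pred))
    cost = [[sum(angular_error(pred[t][i], target[t][j]) for t in frames)
             for j in range(k)] for i in range(k)]
    return best_assignment(cost)

# Hypothetical 3-frame clip with 2 slots; the prediction swaps slots in the last frame.
X, Y = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
target = [[X, Y], [X, Y], [X, Y]]
pred = [[X, Y], [X, Y], [Y, X]]

print(per_frame_matching(pred, target))   # [(0, 1), (0, 1), (1, 0)]
print(clip_level_matching(pred, target))  # (0, 1)
```

Per-frame matching follows the swap in the last frame, while clip-level matching holds one slot-to-source binding for the whole clip and so penalizes the swap.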
DOA Train/Val Gap Root Causes
- ⚠️⚠️⚠️ ZERO spatial augmentation (rotations) → 40-60% of variance
- ⚠️⚠️ SpecAugment train-only → 10-20% variance
- ⚠️⚠️ v10 freezes direction head → 30-40% on multi-source
- ⚠️ Regression sensitivity → 5-15% variance
- ⚠️ Detached prediction asymmetry → 2-5% variance
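To make the top-ranked cause concrete, here is a minimal sketch of what a rotation augmentation could look like. It assumes first-order ambisonic input with channels W, X, Y, Z (X forward, Y left) and azimuth measured counter-clockwise in degrees. This guide does not state the input format, so treat the format, the channel convention, and the function name `rotate_foa_yaw` as assumptions rather than the codebase's API.

```python
import math

def rotate_foa_yaw(w, x, y, z, azimuth_deg, yaw_deg):
    """Rotate an FOA clip and its azimuth labels about the vertical axis.

    w, x, y, z  : per-sample channel values (lists of floats), assumed layout
    azimuth_deg : ground-truth azimuths in degrees
    yaw_deg     : rotation angle in degrees (counter-clockwise)
    """
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    # Only the horizontal dipoles mix; W (omni) and Z (vertical) are unchanged.
    x_rot = [c * xi - s * yi for xi, yi in zip(x, y)]
    y_rot = [s * xi + c * yi for xi, yi in zip(x, y)]
    # Labels rotate by the same angle, wrapped to [-180, 180).
    az_rot = [((a + yaw_deg + 180.0) % 360.0) - 180.0 for a in azimuth_deg]
    return w, x_rot, y_rot, z, az_rot

# A source straight ahead (azimuth 0) ends up at azimuth 90 after a 90 degree yaw.
w, x_r, y_r, z, az = rotate_foa_yaw([1.0], [1.0], [0.0], [0.0], [0.0], 90.0)
print(az)  # [90.0]
```

The key point is that audio and labels are transformed together, so the model sees new directions without any new recordings.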
v11 Experiments (Parallel Runs)
- v11a: DOA demixer → ov2 angles, 5pp+
- v11b: LocalSpatial pre-pool → test IV necessity
- v11c: ACCDOA paradigm → ov3 binding, 5pp+
- v11d: Post-hoc calibration → ov1 ranking, 5pp+
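For readers unfamiliar with the ACCDOA paradigm tested in v11c: each class gets one 3-D vector whose length encodes activity and whose direction encodes the DOA. The sketch below shows the encode/decode idea only. The 0.5 activity threshold, the function names, and the azimuth/elevation convention are assumptions for illustration; the v11c implementation may differ.

```python
import math

def accdoa_encode(active, azimuth_deg, elevation_deg):
    """Return the ACCDOA target vector for one class in one frame."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    unit = (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))
    scale = 1.0 if active else 0.0  # inactive classes map to the zero vector
    return tuple(scale * u for u in unit)

def accdoa_decode(vec, threshold=0.5):
    """Recover (active, azimuth_deg, elevation_deg) from a predicted vector.

    The vector norm acts as the activity score; the threshold is an assumed value.
    """
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold:
        return False, None, None
    azimuth = math.degrees(math.atan2(vec[1], vec[0]))
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, vec[2] / norm))))
    return True, azimuth, elevation

active, az, el = accdoa_decode(accdoa_encode(True, 30.0, 10.0))
print(active, round(az, 3), round(el, 3))
```

Because activity and direction share one regression target, no separate detection head or detection loss weight is needed in this formulation.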
FAQ
Q: Where do I find the direction head loss?
A: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Appendix → search "direction loss" → spatial_loss.py:1562-1565
Q: What's the difference between routes?
A: See the comparison table in SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md, or the detailed Part 2 of SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md
Q: Should I implement fix #1, #2, or #3?
A: Read doa_train_valid_gap_analysis.md Part 6, pick based on your gap size and risk tolerance.
Q: How do I run v11a?
A: Shell script in SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md v11 section + spec in 0427_v11_series.md Section 2.2
Q: I'm stuck on a component. Where's the code?
A: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 6 has a complete reference table with file:line for every component.
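The documents cite code as file:line references such as spatial_loss.py:1562-1565. If you look these up often, a small helper along the following lines may save time. It is a hypothetical convenience script, not part of the codebase, and the `root` argument is an assumption about where your checkout lives.

```python
from pathlib import Path

def parse_code_ref(ref):
    """Split 'file.py:START-END' or 'file.py:LINE' into (path, start, end)."""
    path, _, span = ref.rpartition(":")
    start, _, end = span.partition("-")
    return path, int(start), int(end or start)

def show_code_ref(ref, root="."):
    """Return the referenced source lines (1-indexed, inclusive range)."""
    path, start, end = parse_code_ref(ref)
    lines = (Path(root) / path).read_text().splitlines()
    return lines[start - 1:end]

print(parse_code_ref("spatial_loss.py:1562-1565"))
# ('spatial_loss.py', 1562, 1565)
```

Line numbers drift as the code changes, so treat a reference as a starting point and confirm with the search term given alongside it.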
You Now Have
- Navigation guide for all documents
- Quick reference card with all the essentials
- Architecture bible with code paths
- Diagnostic guide for train/val gaps
- Experimental specifications for v11 series
- Comprehensive metadata (1,883 lines, 77KB)
- All findings tied to exact code locations
Next Steps
- Choose your path above based on how much time you have
- Follow the reading order in that path
- Use cross-references when you need more detail
- Check Appendices for exact code locations
- Reference Part 6/Part 8 when implementing
Document Overview
| Document | Size | Time | Purpose |
|---|---|---|---|
| README_DOCUMENTATION_INDEX | 12KB | 5-10m | Navigation hub |
| SPATIAL_FRAMEWORKS_QUICK_REFERENCE | 7KB | 5-10m | Quick lookup |
| SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS | 28KB | 30-45m | Deep reference |
| doa_train_valid_gap_analysis | 19KB | 20-30m | Diagnostics |
| ANALYSIS_COMPLETION_SUMMARY | 11KB | 10m | Executive summary |
| TOTAL | 77KB | 2-4 hours | Complete set |
Status: Complete and ready for use
Created: 2026-04-27
Next update: After v11 experiments
Pick your path above and start reading!