Documentation Index: Spatial-BEATs Analysis & Reference

📋 Quick Navigation

New to the codebase?

→ Start here: SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (5 min read)

Debugging a train/validation gap in DOA?

→ Go here: doa_train_valid_gap_analysis.md → Part 6 + Part 8

Need a detailed architecture reference?

→ Read here: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md (deep dive, 30 min)

Planning experiments (v11 series)?

→ Use: 0427_v11_series.md + SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (v11 section)

Understanding a specific component (e.g., SourceQueryDecoder)?

→ Use: Search SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 2, then the Appendix for line numbers


📚 Document Reference Table

| Document | Lines | Size | Best For | Read Time |
|---|---|---|---|---|
| ANALYSIS_COMPLETION_SUMMARY (this index) | 150 | 6KB | Overview + navigation | 5 min |
| SPATIAL_FRAMEWORKS_QUICK_REFERENCE | 192 | 7KB | Quick lookup, practitioner guide | 5-10 min |
| SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS | 724 | 28KB | Deep technical reference | 30-45 min |
| doa_train_valid_gap_analysis | 434 | 19KB | Diagnostics + fixes | 20-30 min |
| 0427_v11_series | 185 | 13KB | Experimental design (v11a/b/c/d) | 15 min |
| spatial_beats_ov123_frame_routes | 512 | 22KB | Routes A/B/C architecture | 25 min |
| spatial_beats_training_overview | 420 | 15KB | Training pipeline + presets | 20 min |

🎯 Use Case Lookup

"I need to understand the DOA train/val gap"

  1. Read: doa_train_valid_gap_analysis.md Executive Summary (2 min)
  2. Identify: Which of 6 mechanisms applies to your case (Part 6)
  3. Fix: Follow priority order in Part 8
  4. Reference: Code locations in Appendix

Expected outcome: Root cause identified + fix strategy


"I'm new and want to understand the architecture"

  1. Read: SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md Sections 1-3 (5 min)
  2. Read: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1 + Part 2 (15 min)
  3. Reference: Code locations in Appendix for specific functions
  4. Cross-check: spatial_beats_ov123_frame_routes.md for Routes A/B/C details

Expected outcome: High-level understanding + ability to navigate code


"I want to run v11a experiment"

  1. Read: 0427_v11_series.md Section 2.2 (v11a) (5 min)
  2. Check: SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md v11 section for shell script
  3. Read: 0427_v11_series.md Part 4 (verification method) for how to evaluate results
  4. Reference: Appendix in doa_train_valid_gap_analysis.md for code line numbers if modifying

Expected outcome: Experiment ready to launch, understanding of what to expect


"What are all the spatial frameworks in this codebase?"

  1. Read: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1 (5 min)
  2. Summary: Four frameworks implemented (Spatial-AST, DCASE SELD, EINV2, DETR-slots)
  3. Reference: Part 2 shows how each is implemented as Routes A/B/C

Expected outcome: Framework inventory + where each is implemented


"How do I compare Routes A/B/C?"

  1. Go to: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 2 (15 min)
  2. Check: Comparison table in SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md
  3. Deep dive: spatial_beats_ov123_frame_routes.md for architectural details

Expected outcome: Understanding of paradigm differences, when to use each
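
When comparing routes, it can help to see what Route C's output format implies for decoding. This index describes the ACCDOA heads only as "per-class 3D vectors"; the sketch below assumes the usual ACCDOA reading, where the vector length signals activity and the vector direction gives the DOA. The function name `decode_accdoa`, the 0.5 threshold, and the axis convention (x front, y left, z up) are illustrative assumptions, not taken from the codebase.

```python
import math
from typing import Dict, Sequence, Tuple


def decode_accdoa(
    vectors: Sequence[Tuple[float, float, float]],
    threshold: float = 0.5,  # assumed activity threshold on the vector norm
) -> Dict[int, Tuple[float, float]]:
    """Map class index -> (azimuth_deg, elevation_deg) for one frame.

    A class counts as active when the length of its (x, y, z) vector
    exceeds `threshold`; its direction is then read off the vector.
    """
    active: Dict[int, Tuple[float, float]] = {}
    for class_idx, (x, y, z) in enumerate(vectors):
        norm = math.sqrt(x * x + y * y + z * z)
        if norm <= threshold:
            continue  # treated as inactive in this frame
        azimuth = math.degrees(math.atan2(y, x))
        # Clamp guards asin against tiny floating-point overshoot.
        elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
        active[class_idx] = (azimuth, elevation)
    return active


# Three classes in one frame; the numbers are made up for illustration.
frame = [(0.0, 0.9, 0.0), (0.1, 0.0, 0.1), (0.6, 0.0, 0.6)]
for class_idx, (az, el) in decode_accdoa(frame).items():
    print(class_idx, round(az, 1), round(el, 1))
```

Because each class owns exactly one vector, this format cannot represent two simultaneous sources of the same class, which is one of the paradigm differences to keep in mind when reading the Route B and Route C comparisons.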


"What changed from v7 to v11?"

  1. Read: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 3 (10 min)
  2. Reference: SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md version series for quick compare
  3. Deep dive: doa_train_valid_gap_analysis.md Part 3 for v9/v10 details
  4. Experimental: 0427_v11_series.md Section 1 for v11 rationale

Expected outcome: Version history + innovation tracking


"Where exactly is the direction head loss computed?"

  1. Go to: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Appendix (search "direction loss")
  2. Result: spatial_loss.py:1562-1565
  3. Read: doa_train_valid_gap_analysis.md Part 2.3 for context

Expected outcome: Exact code location + understanding of loss formulation
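
Citations in these documents use the form `file:start-end`, as in `spatial_loss.py:1562-1565`. A small helper along the following lines could turn such a citation into the cited source lines. This is a sketch: the function names and the `root` argument are hypothetical, and it assumes the cited path is relative to whatever directory you pass in.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches citations such as "spatial_loss.py:1562-1565" or "notes.md:18".
_CITATION = re.compile(r"^(?P<path>[^:]+):(?P<start>\d+)(?:-(?P<end>\d+))?$")


def parse_citation(citation: str) -> Tuple[str, int, int]:
    """Split 'file:start-end' into (file, start, end).

    A single-line citation such as 'file:18' yields start == end.
    """
    match = _CITATION.match(citation.strip())
    if match is None:
        raise ValueError(f"not a file:line citation: {citation!r}")
    start = int(match.group("start"))
    end = int(match.group("end") or start)
    if end < start:
        raise ValueError(f"line range is reversed: {citation!r}")
    return match.group("path"), start, end


def read_cited_lines(citation: str, root: str = ".") -> List[str]:
    """Return the cited lines (1-indexed, inclusive) from a file under `root`."""
    path, start, end = parse_citation(citation)
    lines = Path(root, path).read_text(encoding="utf-8").splitlines()
    return lines[start - 1:end]


print(parse_citation("spatial_loss.py:1562-1565"))
```

Line numbers drift as code changes, so a lookup like this is only as reliable as the commit the citation was written against.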


🔍 Code Navigation Quick Reference

For Each Major Component

| Component | Primary Ref | Backup Ref | Concept |
|---|---|---|---|
| LocalSpatialEncoder | ANALYSIS Part 6 | QUICK_REF "Key locations" | 7-channel FOA → spatial features |
| SourceQueryDecoder | ROUTES p.20-30 | ANALYSIS Part 2.2b | K track queries → per-frame features |
| FrameTrackPredictionHeads | QUICK_REF Appendix | ANALYSIS Part 2.2b | Predicts act/class/dir/dist per frame |
| Hungarian Matching | DOA_GAP Part 2.2 | ANALYSIS Part 4 | How sources get assigned to slots/queries |
| ClassHeadSpectralDemixer | ANALYSIS Part 3 | QUICK_REF "v9 baseline" | Breaks frequency pooling bottleneck |
| ACCDOA heads | ROUTES p.40+ | ANALYSIS Part 1.2 | Route C: per-class 3D vectors |
| SpecAugment | DOA_GAP Part 1.2 | QUICK_REF "Training" | Spectral masking (train-only!) |
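
To make the "Hungarian Matching" row concrete: the matching step picks a one-to-one assignment of ground-truth sources to slots or queries that minimizes a total cost. The sketch below solves the same assignment problem by exhaustive search over permutations, which gives the same optimum as the Hungarian algorithm for a handful of queries but scales factorially. The function name and the cost values are made up for illustration; how the codebase builds its cost matrix is covered in the referenced documents, not here.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def match_sources_to_queries(cost: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Return (source, query) pairs that minimize the total assignment cost.

    cost[s][q] is the cost of assigning ground-truth source s to query q.
    Requires no more sources than queries. Exhaustive search, so only
    practical for a small number of queries.
    """
    n_sources = len(cost)
    if n_sources == 0:
        return []
    n_queries = len(cost[0])
    if n_sources > n_queries:
        raise ValueError("more sources than queries")
    best_total = float("inf")
    best_assignment: Tuple[int, ...] = ()
    # Each candidate picks a distinct query for every source.
    for queries in permutations(range(n_queries), n_sources):
        total = sum(cost[s][q] for s, q in enumerate(queries))
        if total < best_total:
            best_total, best_assignment = total, queries
    return list(enumerate(best_assignment))


# Two sources, three queries; the costs are illustrative numbers.
example_cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.3, 0.8],
]
print(match_sources_to_queries(example_cost))  # [(0, 1), (1, 0)]
```

For realistic sizes, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` is the usual choice; the brute-force version is shown only because it needs no dependencies.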

📊 Experiment Planning Matrix

To understand which experiment tests what:

Use:

  • 0427_v11_series.md Section 2 (detailed specifications)
  • SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md v11 table (quick reference)
  • doa_train_valid_gap_analysis.md Part 6 (root causes)

| Problem | Experiment | Mechanism Tested | Expected Impact | Doc Reference |
|---|---|---|---|---|
| ov2 angle errors (73.9%) | v11a | DOA demixer | ↓ 5pp+ | 0427_v11_series.md:18-40 |
| ov2/ov3 angles | v11b | IV signal path | Compare vs v11a | 0427_v11_series.md:41-63 |
| ov3 binding (24.5%) | v11c | ACCDOA paradigm | ↓ 5pp+ | 0427_v11_series.md:64-87 |
| ov1 ranking (37% loss) | v11d | Post-hoc decoding | ↑ 5pp+ | 0427_v11_series.md:88-112 |

⚑ Critical Findings Summary

Three Most Important Things to Know

  1. Zero spatial augmentation (rotations) is the #1 cause of DOA train/val gap

    • Location: doa_train_valid_gap_analysis.md Executive Summary + Part 6
    • Impact: 40-60% of variance
    • Fix: See Part 8 recommendation #1
  2. Three parallel routes (A/B/C) coexist in the codebase

    • Route A: Per-frame slot allocation (unstable)
    • Route B: Learnable track queries (production, v9)
    • Route C: Per-class vectors (prototype, being tested in v11c)
    • Reference: SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 2
  3. v10 phase-1 freezes the direction head entirely

    • Direction head gets no gradients for 10 epochs
    • Then unfrozen with poor initialization
    • Causes 30-40% DOA metric drop on multi-source splits
    • Reference: doa_train_valid_gap_analysis.md Part 3.2 + Part 6
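
To illustrate what the missing spatial augmentation in finding 1 would involve: rotating a first-order ambisonic (FOA) recording about the vertical axis only mixes the two horizontal channels, and the DOA label must be rotated by the same angle. The sketch below assumes a (W, X, Y, Z) channel order and Cartesian unit-vector labels; both are assumptions, and the actual fix is the one described in Part 8 of doa_train_valid_gap_analysis.md. Any features derived from the raw channels would need to be recomputed after rotation.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rotate_yaw(vec: Vec3, angle_rad: float) -> Vec3:
    """Rotate a Cartesian (x, y, z) vector about the z-axis."""
    x, y, z = vec
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y, z)


def rotate_foa_frame(frame: Sequence[float], angle_rad: float) -> List[float]:
    """Rotate one FOA sample, assumed to be ordered (W, X, Y, Z).

    W (omnidirectional) and Z (vertical) are unchanged by a yaw rotation.
    """
    w, x, y, z = frame
    rx, ry, rz = rotate_yaw((x, y, z), angle_rad)
    return [w, rx, ry, rz]


def augment(
    frames: Sequence[Sequence[float]], doa_label: Vec3, angle_rad: float
) -> Tuple[List[List[float]], Vec3]:
    """Apply the same yaw rotation to the audio and to its DOA label."""
    rotated_audio = [rotate_foa_frame(f, angle_rad) for f in frames]
    return rotated_audio, rotate_yaw(doa_label, angle_rad)


# A source straight ahead (+x) moved 90 degrees to the left (+y).
audio, label = augment([[1.0, 1.0, 0.0, 0.0]], (1.0, 0.0, 0.0), math.pi / 2)
print([round(v, 6) for v in audio[0]], tuple(round(v, 6) for v in label))
```

Like SpecAugment, an augmentation of this kind would be applied to training batches only, so that validation data stays unmodified.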

🔗 Cross-Reference Guide

"I'm reading X, how do I find related content?"

| Reading | See Also |
|---|---|
| ROUTES page 20-30 (Route B) | ANALYSIS Part 2.2b, QUICK_REF "Routes", DOA_GAP Part 2-3 |
| ANALYSIS Part 1 (Spatial-AST) | QUICK_REF table, ANALYSIS Part 6 (code locations) |
| DOA_GAP Part 8 (fixes) | ANALYSIS Part 6 (line numbers), QUICK_REF "Training" |
| 0427_v11_series Section 2 (v11 specs) | QUICK_REF v11 table, DOA_GAP Part 6 (why these experiments) |
| TRAINING_OVERVIEW | ANALYSIS Part 5 (configs), QUICK_REF "Loss weights" |

📝 How Documents Were Created

All documentation was created from a comprehensive codebase analysis on 2026-04-27:

  • ✅ 4 spatial frameworks identified (Spatial-AST, DCASE SELD, EINV2, DETR-slots)
  • ✅ 3 parallel routes analyzed with full architecture specifications
  • ✅ 6 mechanisms causing DOA train/val gaps identified
  • ✅ v7→v11 experimental series mapped with root cause tracing
  • ✅ 10,000+ lines of code reviewed and cross-referenced
  • ✅ All findings tied to exact file:line numbers

Quality Assurance:

  • Code references verified with actual line numbers
  • Architecture descriptions validated against source
  • Experimental hypotheses cross-checked with docstrings
  • Cross-document consistency checked

🚀 Recommended Reading Order

For Different Roles

Principal Investigator / Project Lead

  1. This index + section on "Critical Findings" (5 min)
  2. ANALYSIS_COMPLETION_SUMMARY.md (10 min)
  3. 0427_v11_series.md Section 1 (diagnostics review) + Section 5 (order) (10 min)

→ Decision: Approve v11 experiments? (Total: 25 min)

Researcher / Experimenter

  1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (10 min)
  2. 0427_v11_series.md full document (15 min)
  3. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 2 + 3 (25 min)

→ Ready: Design and run experiments (Total: 50 min)

New Contributor / Intern

  1. SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md (10 min)
  2. SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 1 + 2 (25 min)
  3. Pick a component (e.g., "LocalSpatialEncoder") → find it in Part 6 → read the code
  4. spatial_beats_training_overview.md or spatial_beats_coding_guide.md as needed

→ Goal: Understand codebase (Total: 1-2 hours)

Debugging Train/Val Gap

  1. doa_train_valid_gap_analysis.md Executive Summary + Part 6 (10 min)
  2. Part 7 (diagnostics): check your logs (10 min)
  3. Part 8 (fixes): pick priority #1-3 (5 min)
  4. Appendix: get code locations (5 min)

→ Goal: Root cause + fix strategy (Total: 30 min)

📞 FAQ About Documentation

Q: Where do I find the code for Route A, B, or C?
A: See SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md Part 2.1/2.2/2.3. Exact file:line references for each route are in the Part 6 Appendix.

Q: What's the difference between v9 and v10?
A: Compare doa_train_valid_gap_analysis.md Part 3.1 with Part 3.2, or see the quick summary in SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md.

Q: Should I implement fix #1, #2, or #3?
A: It depends on your problem. See doa_train_valid_gap_analysis.md Part 6 and rank the root causes by severity against your gap size.

Q: How long will v11 experiments take?
A: ~14 days total. See 0427_v11_series.md Section 5 for the recommended order (serial vs parallel).

Q: Can I run v11 without understanding everything?
A: Yes! Copy the shell script from SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md, then follow Part 4 (verification method) in 0427_v11_series.md.


✅ Checklist: What You Can Do Now

After reading appropriate documentation:

  • Understand what spatial frameworks exist in codebase
  • Identify which route (A/B/C) solves your problem
  • Diagnose your train/val gap (Part 6 in DOA_GAP)
  • Plan an experiment (v11 specs + order)
  • Find exact code locations (Appendix tables)
  • Understand loss weight patterns (QUICK_REF tables)
  • Know when to hot-start from which checkpoint
  • Compare validation metrics across routes

📄 License & Citation

These documents are analysis artifacts for internal research use. They synthesize information from:

  • Source code comments and docstrings
  • Configuration file specifications
  • DCASE challenge documentation (officially referenced in code)
  • Research paper citations in docstrings

Last Updated: 2026-04-27
Analysis Completed: Yes
Ready for Use: Yes
Maintenance: Update after v11 experiments complete

For questions or clarifications, refer to exact file:line citations in the appendices of technical documents.