# Documentation Index: Spatial-BEATs Analysis & Reference

## 📋 Quick Navigation

### **New to the codebase?**
→ Start here: [`SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`](SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md) (5 min read)

### **Debugging a train/validation gap in DOA?**
→ Go here: [`doa_train_valid_gap_analysis.md`](doa_train_valid_gap_analysis.md) → Part 6 + Part 8

### **Need detailed architecture reference?**
→ Read here: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) (deep dive, 30 min)

### **Planning experiments (v11 series)?**
→ Use: [`0427_v11_series.md`](0427_v11_series.md) + `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` (v11 section)

### **Understanding a specific component (e.g., SourceQueryDecoder)?**
→ Use: Search in `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2 + Appendix for line numbers

---

## 📚 Document Reference Table

| Document | Lines | Size | Best For | Read Time |
|----------|-------|------|----------|-----------|
| **ANALYSIS_COMPLETION_SUMMARY** (this index) | 150 | 6KB | Overview + navigation | 5 min |
| **SPATIAL_FRAMEWORKS_QUICK_REFERENCE** | 192 | 7KB | Quick lookup, practitioner guide | 5-10 min |
| **SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS** | 724 | 28KB | Deep technical reference | 30-45 min |
| **doa_train_valid_gap_analysis** | 434 | 19KB | Diagnostics + fixes | 20-30 min |
| **0427_v11_series** | 185 | 13KB | Experimental design (v11a/b/c/d) | 15 min |
| **spatial_beats_ov123_frame_routes** | 512 | 22KB | Routes A/B/C architecture | 25 min |
| **spatial_beats_training_overview** | 420 | 15KB | Training pipeline + presets | 20 min |

---

## 🎯 Use Case Lookup

### "I need to understand the DOA train/val gap"
1. Read: [`doa_train_valid_gap_analysis.md`](doa_train_valid_gap_analysis.md) **Executive Summary** (2 min)
2. Identify: Which of 6 mechanisms applies to your case (Part 6)
3. Fix: Follow priority order in Part 8
4. Reference: Code locations in Appendix

**Expected outcome**: Root cause identified + fix strategy

---

### "I'm new and want to understand the architecture"
1. Read: [`SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`](SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md) **Sections 1-3** (5 min)
2. Read: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 1 + Part 2** (15 min)
3. Reference: Code locations in Appendix for specific functions
4. Cross-check: `spatial_beats_ov123_frame_routes.md` for Routes A/B/C details

**Expected outcome**: High-level understanding + ability to navigate code

---

### "I want to run v11a experiment"
1. Read: [`0427_v11_series.md`](0427_v11_series.md) **Section 2.2 (v11a)** (5 min)
2. Check: `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` **v11 section** for shell script
3. Read: `0427_v11_series.md` Part 4 (verification method) for how to evaluate results
4. Reference: Appendix in `doa_train_valid_gap_analysis.md` for code line numbers if modifying

**Expected outcome**: Experiment ready to launch, understanding of what to expect

---

### "What are all the spatial frameworks in this codebase?"
1. Read: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 1** (5 min)
2. Summary: Four frameworks implemented (Spatial-AST, DCASE SELD, EINV2, DETR-slots)
3. Reference: Part 2 shows how each is implemented as Routes A/B/C

**Expected outcome**: Framework inventory + where each is implemented

---

### "How do I compare Routes A/B/C?"
1. Go to: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 2** (15 min)
2. Check: Comparison table in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`
3. Deep dive: `spatial_beats_ov123_frame_routes.md` for architectural details

**Expected outcome**: Understanding of paradigm differences, when to use each
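
To make the Route C paradigm concrete: this index describes it as ACCDOA heads producing per-class 3D vectors. In the general ACCDOA formulation, the length of each class vector encodes activity and its direction encodes the DOA. The sketch below decodes one frame under that formulation. It is an illustration, not code from this repository: the function name `decode_accdoa`, the dictionary layout, and the 0.5 activity threshold are assumptions.

```python
import math


def decode_accdoa(class_vectors, threshold=0.5):
    """Decode one frame of ACCDOA output.

    class_vectors: {class_name: (x, y, z)} (hypothetical layout).
    Returns {class_name: (azimuth_deg, elevation_deg)} for active classes only.
    """
    detections = {}
    for name, (x, y, z) in class_vectors.items():
        norm = math.sqrt(x * x + y * y + z * z)
        if norm <= threshold:
            continue  # vector too short: class treated as inactive in this frame
        azimuth = math.degrees(math.atan2(y, x))
        # Clamp guards against tiny floating-point overshoot before asin.
        elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
        detections[name] = (azimuth, elevation)
    return detections


frame = {"speech": (0.0, 0.9, 0.0), "dog": (0.1, 0.0, 0.1)}
print(decode_accdoa(frame))
```

Because each class owns exactly one vector, this decoding cannot represent two simultaneous sources of the same class, which is one paradigm difference worth checking against the slot-based and query-based routes.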

---

### "What changed from v7 to v11?"
1. Read: [`SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md`](SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md) **Part 3** (10 min)
2. Reference: `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` **version series** for quick compare
3. Deep dive: `doa_train_valid_gap_analysis.md` **Part 3** for v9/v10 details
4. Experimental: `0427_v11_series.md` **Section 1** for v11 rationale

**Expected outcome**: Version history + innovation tracking

---

### "Where exactly is the direction head loss computed?"
1. Go to: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` **Appendix** (search "direction loss")
2. Result: `spatial_loss.py:1562-1565`
3. Read: `doa_train_valid_gap_analysis.md` **Part 2.3** for context

**Expected outcome**: Exact code location + understanding of loss formulation

---

## 🔍 Code Navigation Quick Reference

### For Each Major Component

| Component | Primary Ref | Backup Ref | Concept |
|-----------|------------|-----------|---------|
| **LocalSpatialEncoder** | ANALYSIS Part 6 | QUICK_REF "Key locations" | 7-channel FOA → spatial features |
| **SourceQueryDecoder** | ROUTES p.20-30 | ANALYSIS Part 2.2b | K track queries → per-frame features |
| **FrameTrackPredictionHeads** | QUICK_REF Appendix | ANALYSIS Part 2.2b | Predicts act/class/dir/dist per frame |
| **Hungarian Matching** | DOA_GAP Part 2.2 | ANALYSIS Part 4 | How sources get assigned to slots/queries |
| **ClassHeadSpectralDemixer** | ANALYSIS Part 3 | QUICK_REF "v9 baseline" | Breaks frequency pooling bottleneck |
| **ACCDOA heads** | ROUTES p.40+ | ANALYSIS Part 1.2 | Route C: per-class 3D vectors |
| **SpecAugment** | DOA_GAP Part 1.2 | QUICK_REF "Training" | Spectral masking (train-only!) |
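
As a rough illustration of the Hungarian Matching row above, the sketch below assigns ground-truth sources to predicted slots by minimising total angular error. It is not the codebase's implementation: the cost function, the names `angular_error_deg` and `match_sources_to_slots`, and the inputs are assumptions. It also swaps in a brute-force search over permutations, which finds the same optimal assignment as the Hungarian algorithm but is only practical for a handful of slots.

```python
import itertools
import math


def angular_error_deg(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))


def match_sources_to_slots(pred_dirs, true_dirs):
    """Return (assignment, total_cost); assignment[i] is the slot for source i.

    Assumes len(pred_dirs) >= len(true_dirs). Brute force stands in for the
    Hungarian algorithm here and is exponential in the number of slots.
    """
    best_perm, best_cost = None, math.inf
    slots = range(len(pred_dirs))
    for perm in itertools.permutations(slots, len(true_dirs)):
        cost = sum(
            angular_error_deg(pred_dirs[slot], true_dirs[src])
            for src, slot in enumerate(perm)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


pred = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]  # three slots
true = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]                   # two sources
print(match_sources_to_slots(pred, true))
```

For realistic slot counts, a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` applied to the same cost matrix would be the usual choice.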

---

## 📊 Experiment Planning Matrix

### To understand **which experiment tests what**:

Use: `0427_v11_series.md` Section 2 (detailed specifications)
+ `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` v11 table (quick reference)
+ `doa_train_valid_gap_analysis.md` Part 6 (root causes)

| Problem | Experiment | Mechanism Tested | Expected Impact | Doc Reference |
|---------|-----------|-----------------|-----------------|----------------|
| ov2 angle errors (73.9%) | v11a | DOA demixer | ↓ 5pp+ | 0427_v11_series.md:18-40 |
| ov2/ov3 angles | v11b | IV signal path | Compare vs v11a | 0427_v11_series.md:41-63 |
| ov3 binding (24.5%) | v11c | ACCDOA paradigm | ↓ 5pp+ | 0427_v11_series.md:64-87 |
| ov1 ranking (37% loss) | v11d | Post-hoc decoding | ↑ 5pp+ | 0427_v11_series.md:88-112 |

---

## ⚡ Critical Findings Summary

### Three Most Important Things to Know

1. **Zero spatial augmentation (rotations)** is the #1 cause of the DOA train/val gap
   - Location: `doa_train_valid_gap_analysis.md` **Executive Summary** + Part 6
   - Impact: 40-60% of variance
   - Fix: See Part 8 recommendation #1

2. **Three parallel routes (A/B/C) coexist in the codebase**
   - Route A: Per-frame slot allocation (unstable)
   - Route B: Learnable track queries (production, v9)
   - Route C: Per-class vectors (prototype, being tested in v11c)
   - Reference: `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2

3. **v10 phase-1 freezes the direction head entirely**
   - Direction head gets no gradients for 10 epochs
   - Then unfrozen with poor initialization
   - Causes 30-40% DOA metric drop on multi-source splits
   - Reference: `doa_train_valid_gap_analysis.md` Part 3.2 + Part 6
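
To illustrate what the spatial augmentation missing in finding 1 could look like, here is a minimal sketch of a yaw rotation applied to first-order ambisonics (FOA) audio together with its azimuth label. It is not the fix from Part 8 and not code from this repository. It assumes four raw FOA channels in ACN order (W, Y, Z, X), azimuth measured counter-clockwise in radians, and plain Python lists of samples. The names `rotate_foa_yaw` and `rotate_azimuth_label` are hypothetical.

```python
import math


def rotate_foa_yaw(w, y, z, x, yaw_rad):
    """Rotate an FOA signal about the vertical axis by yaw_rad.

    Channels are lists of samples in ACN order (W, Y, Z, X), an assumption.
    W (omnidirectional) and Z (vertical) are unaffected by a yaw rotation.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    x_rot = [c * xi - s * yi for xi, yi in zip(x, y)]
    y_rot = [s * xi + c * yi for xi, yi in zip(x, y)]
    return list(w), y_rot, list(z), x_rot


def rotate_azimuth_label(azimuth_rad, yaw_rad):
    """Shift a DOA azimuth label by the same yaw, wrapped to [-pi, pi)."""
    return (azimuth_rad + yaw_rad + math.pi) % (2 * math.pi) - math.pi


# A source straight ahead (azimuth 0) rotated by 90 degrees ends up on the Y axis.
w, y, z, x = rotate_foa_yaw([1.0], [0.0], [0.0], [1.0], math.pi / 2)
print(x, y, rotate_azimuth_label(0.0, math.pi / 2))
```

The important detail is that the audio and the label must be rotated by the same angle. Any features derived from the channels, such as intensity vectors, would need to be recomputed after the rotation.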

---

## 🔗 Cross-Reference Guide

### "I'm reading X, how do I find related content?"

| Reading | See Also |
|---------|----------|
| ROUTES page 20-30 (Route B) | ANALYSIS Part 2.2b, QUICK_REF "Routes", DOA_GAP Part 2-3 |
| ANALYSIS Part 1 (Spatial-AST) | QUICK_REF table, ANALYSIS Part 6 (code locations) |
| DOA_GAP Part 8 (fixes) | ANALYSIS Part 6 (line numbers), QUICK_REF "Training" |
| 0427_v11_series Section 2 (v11 specs) | QUICK_REF v11 table, DOA_GAP Part 6 (why these experiments) |
| TRAINING_OVERVIEW | ANALYSIS Part 5 (configs), QUICK_REF "Loss weights" |

---

## 📝 How Documents Were Created

All documentation was created from a **comprehensive codebase analysis** on 2026-04-27:
- ✅ 4 spatial frameworks identified (Spatial-AST, DCASE SELD, EINV2, DETR-slots)
- ✅ 3 parallel routes analyzed with full architecture specifications
- ✅ 6 mechanisms causing DOA train/val gaps discovered
- ✅ v7→v11 experimental series mapped with root cause tracing
- ✅ ~10,000+ lines of code reviewed and cross-referenced
- ✅ All findings tied to exact file:line numbers

**Quality Assurance**:
- Code references verified with actual line numbers
- Architecture descriptions validated against source
- Experimental hypotheses cross-checked with docstrings
- Cross-document consistency checked

---

## 🚀 Recommended Reading Order

### **For Different Roles**

#### **Principal Investigator / Project Lead**
1. This index + section on "Critical Findings" (5 min)
2. `ANALYSIS_COMPLETION_SUMMARY.md` (10 min)
3. `0427_v11_series.md` Section 1 (diagnostics review) + Section 5 (order) (10 min)
→ **Decision**: Approve v11 experiments? (Total: 25 min)

#### **Researcher / Experimenter**
1. `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` (10 min)
2. `0427_v11_series.md` full document (15 min)
3. `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2 + 3 (25 min)
→ **Ready**: Design and run experiments (Total: 50 min)

#### **New Contributor / Intern**
1. `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md` (10 min)
2. `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 1 + 2 (25 min)
3. Pick a component (e.g., "LocalSpatialEncoder") → find in Part 6 → read code
4. `spatial_beats_training_overview.md` or `spatial_beats_coding_guide.md` as needed
→ **Goal**: Understand codebase (Total: 1-2 hours)

#### **Debugging Train/Val Gap**
1. `doa_train_valid_gap_analysis.md` Executive Summary + Part 6 (10 min)
2. Part 7 (diagnostics): check your logs (10 min)
3. Part 8 (fixes): pick priority #1-3 (5 min)
4. Appendix: get code locations (5 min)
→ **Goal**: Root cause + fix strategy (Total: 30 min)

---

## 📞 FAQ About Documentation

**Q: Where do I find the code for Route A, B, or C?**
A: See `SPATIAL_AUDIO_FRAMEWORKS_ANALYSIS.md` Part 2.1/2.2/2.3; each has exact file:line references in the Part 6 Appendix.

**Q: What's the difference between v9 and v10?**
A: `doa_train_valid_gap_analysis.md` Part 3.1 vs 3.2, or quick summary in `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`.

**Q: Should I implement fix #1, #2, or #3?**
A: Depends on your problem. See `doa_train_valid_gap_analysis.md` Part 6: rank root causes by severity against your gap size.

**Q: How long will v11 experiments take?**
A: ~14 days total. See `0427_v11_series.md` Section 5 for recommended order (serial vs parallel).

**Q: Can I run v11 without understanding everything?**
A: Yes! Copy the shell script from `SPATIAL_FRAMEWORKS_QUICK_REFERENCE.md`, follow Part 4 (verification method) in `0427_v11_series.md`.

---

## ✅ Checklist: What You Can Do Now

After reading the appropriate documentation:

- [ ] Understand what spatial frameworks exist in codebase
- [ ] Identify which route (A/B/C) solves your problem
- [ ] Diagnose your train/val gap (Part 6 in DOA_GAP)
- [ ] Plan an experiment (v11 specs + order)
- [ ] Find exact code locations (Appendix tables)
- [ ] Understand loss weight patterns (QUICK_REF tables)
- [ ] Know when to hot-start from which checkpoint
- [ ] Compare validation metrics across routes

---

## 📄 License & Citation

These documents are analysis artifacts for internal research use. They synthesize information from:
- Source code comments and docstrings
- Configuration file specifications  
- DCASE challenge documentation (officially referenced in code)
- Research paper citations in docstrings

---

**Last Updated**: 2026-04-27  
**Analysis Completed**: Yes  
**Ready for Use**: Yes  
**Maintenance**: Update after v11 experiments complete

For questions or clarifications, refer to exact file:line citations in the appendices of technical documents.