# Debug Log

## Sanity Check Results (270 checks across 54+36 configs)

### Poisson-Gamma VI (198 checks)
- **Parameters positive**: 198/198 ✓
- **No NaN**: 198/198 ✓
- **Responsibilities sum to 1**: 198/198 ✓
- **ELBO finite**: 198/198 ✓
- **Exact differs from full**: 174/198 (24 trivial deletions with near-zero edge counts)
- **Error decreases with R**: ~90% (failures in high-coupling regimes)

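
The four pass/fail checks listed for the Poisson-Gamma runs can be written as one small predicate per fitted configuration. The sketch below is only an illustration, not the project's actual checker: the name `run_sanity_checks`, the flat parameter layout, and the `1e-8` tolerance on responsibility sums are all assumptions.

```python
import math


def run_sanity_checks(params, responsibilities, elbo, tol=1e-8):
    """Return pass/fail flags for one fitted configuration.

    params: flat list of variational parameters (hypothetical layout)
    responsibilities: list of rows, each row a list of probabilities
    elbo: final ELBO value
    tol: allowed deviation of each responsibility row sum from 1 (assumed)
    """
    flat = list(params) + [r for row in responsibilities for r in row] + [elbo]
    return {
        "params_positive": all(p > 0 for p in params),
        "no_nan": not any(math.isnan(v) for v in flat),
        "resp_sum_to_one": all(
            abs(sum(row) - 1.0) <= tol for row in responsibilities
        ),
        "elbo_finite": math.isfinite(elbo),
    }


# Toy values, not taken from any real run.
result = run_sanity_checks(
    params=[0.5, 2.0, 1.3],
    responsibilities=[[0.2, 0.8], [0.5, 0.5]],
    elbo=-123.4,
)
```

Keeping each check as a separate flag makes it possible to report per-check pass counts in the same form as the list above.
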
### Gaussian-Gaussian VI (36 checks)
- **Error decreases with R**: ~85%
- Gaussian VI converges reliably

### Gaussian-Gamma MAP (36 checks)
- **Error decreases with R**: ~70%
- Higher failure rate due to non-convex optimization
- All runs use Adam optimizer (lr=0.05, grad_clip=10, max_iter=2000)

## Numerical Issues

### CAVI Convergence
- Many configurations hit the iteration cap (max_iter=200-300) without meeting the strict convergence tolerance (tol=1e-5)
- Parameters stabilize well before the tolerance threshold is reached
- Weak priors (a0=b0=0.1) combined with high count scales converge slowest

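
The gap between hitting the iteration cap and being practically stable can be seen in a generic fixed-point loop. The sketch below is a minimal stand-in, assuming a max-relative-change stopping rule. The helper `run_fixed_point` and the toy `slow_update` are hypothetical and do not reproduce the real CAVI updates.

```python
def run_fixed_point(update, params, tol=1e-5, max_iter=300):
    """Iterate `update` until the largest relative parameter change is < tol.

    Returns (params, n_iter, converged, last_change).
    """
    change = float("inf")
    for it in range(1, max_iter + 1):
        new_params = update(params)
        change = max(
            abs(n - o) / max(abs(o), 1e-12)
            for n, o in zip(new_params, params)
        )
        params = new_params
        if change < tol:
            return params, it, True, change
    return params, max_iter, False, change


def slow_update(p, rate=0.02, target=(3.0, 1.5)):
    """Toy slowly-contracting map standing in for one coordinate sweep."""
    return [x + rate * (t - x) for x, t in zip(p, target)]


params, n_iter, converged, change = run_fixed_point(slow_update, [1.0, 1.0])
```

In this toy run the loop stops at the cap with the last relative change still above `tol`, even though the parameters are already close to their fixed point. Logging `last_change` alongside the `converged` flag makes that distinction visible.
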
### Gaussian-Gamma MAP Optimizer
- v1 (vanilla SGD, lr=0.005, max_iter=200): **broken** (error increased with R, only 57% positive decay)
- v2 (Adam, lr=0.05, grad_clip=10, max_iter=2000): **fixed** (error decreases monotonically, 54% positive decay)
- The remaining 46% with non-positive decay is inherent to MAP: the full and exact-deletion fits follow different optimization paths
- Objective trace plateaus by ~1500 iterations, indicating convergence

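
For reference, the v2 settings (lr=0.05, grad_clip=10, max_iter=2000) correspond to a loop like the one below. This is a dependency-free sketch, not the project's optimizer code. It assumes `grad_clip` is a global-norm clip and uses the standard Adam defaults for `beta1`, `beta2`, and `eps`. The quadratic `toy_grad` is a placeholder for the real MAP objective.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.05, grad_clip=10.0, max_iter=2000,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam with global-norm gradient clipping (clipping style is assumed)."""
    x = list(x0)
    m = [0.0] * len(x)  # first-moment estimate
    v = [0.0] * len(x)  # second-moment estimate
    for t in range(1, max_iter + 1):
        g = grad_fn(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm > grad_clip:
            g = [gi * grad_clip / norm for gi in g]
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1.0 - beta1) * gi
            v[i] = beta2 * v[i] + (1.0 - beta2) * gi * gi
            m_hat = m[i] / (1.0 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1.0 - beta2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def toy_grad(x):
    """Gradient of the toy objective (x0 - 3)^2 + (x1 - 1)^2."""
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] - 1.0)]


# The initial gradient norm exceeds 10, so the clip is active at the start.
x_opt = adam_minimize(toy_grad, [10.0, -4.0])
```
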
### Chi Proxy Anomaly
- **Finding**: χ_max(z) correlates *negatively* with local error (Spearman ρ = -0.28 to -0.50 within regimes)
- **Expected**: positive correlation (higher χ → harder → higher error)
- **Explanation**: The Dobrushin bound is a *sufficient* condition, not tight. High χ can coexist with fast empirical decay because:
  1. The bound takes worst-case over operator norms
  2. High-degree nodes have high χ but their neighborhoods capture more of the relevant graph
  3. The actual deletion influence depends on the specific edge structure, not just the bound
- **Impact on paper**: Report honestly. The theory gives valid locality *guarantees* but χ is not a practical *predictor* of deletion difficulty.

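
The within-regime Spearman correlation reported in this section can be computed without external libraries. The sketch below assumes records of the form `(regime, chi_max, local_error)`. That layout, the helper names, and the toy numbers are all hypothetical and are not taken from the experiment logs.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """1-based ranks, with ties assigned the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_by_regime(records):
    groups = defaultdict(list)
    for regime, chi, err in records:
        groups[regime].append((chi, err))
    return {
        regime: spearman([c for c, _ in pts], [e for _, e in pts])
        for regime, pts in groups.items()
    }


# Toy data only: error tends to fall as chi rises inside each regime.
records = [
    ("weak", 0.1, 0.30), ("weak", 0.2, 0.25),
    ("weak", 0.3, 0.20), ("weak", 0.4, 0.22),
    ("strong", 0.6, 0.50), ("strong", 0.7, 0.55),
    ("strong", 0.8, 0.40), ("strong", 0.9, 0.35),
]
rho = spearman_by_regime(records)
```

Grouping by regime before ranking matters because pooling regimes with different error scales can hide or reverse the within-regime sign.
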
## Exclusion Rules
- Configs with fewer than 10 edges after count generation: skipped
- Decay fits with fewer than 3 valid distance shells: marked invalid
- 24/198 PG deletions where exact ≈ full (trivial edges): included in data but noted

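
The first two rules can be applied as simple guards around the decay fit. The sketch below makes several assumptions: the decay is fit as a log-linear (exponential) model of error against radius, a "valid" shell is one with a positive finite error, and the names `should_skip_config` and `fit_decay` are hypothetical.

```python
import math

MIN_EDGES = 10   # configs below this edge count are skipped
MIN_SHELLS = 3   # fits with fewer valid shells are marked invalid


def should_skip_config(n_edges):
    return n_edges < MIN_EDGES


def fit_decay(radii, errors):
    """Least-squares fit of log(error) = intercept - rate * R.

    Shells with non-positive or non-finite error are dropped first
    (an assumed definition of "valid").
    """
    pts = [
        (r, math.log(e))
        for r, e in zip(radii, errors)
        if e > 0 and math.isfinite(e)
    ]
    n = len(pts)
    if n < MIN_SHELLS:
        return {"valid": False, "rate": None, "n_shells": n}
    mean_r = sum(r for r, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    sxx = sum((r - mean_r) ** 2 for r, _ in pts)
    if sxx == 0:  # all shells at the same radius: slope is undefined
        return {"valid": False, "rate": None, "n_shells": n}
    sxy = sum((r - mean_r) * (l - mean_l) for r, l in pts)
    return {"valid": True, "rate": -sxy / sxx, "n_shells": n}


# Synthetic errors decaying as exp(-0.5 * R); a positive rate means decay.
fit = fit_decay([1, 2, 3, 4], [math.exp(-0.5 * r) for r in (1, 2, 3, 4)])
```
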
## MovieLens Binary Caveat
- All observations are x_ij = 1, producing near-zero per-edge influence
- RelErr(R=2) < 10^{-4}: the deletion is trivially local
- Included for completeness but not informative for testing the theory
- The rating-count transformation is more meaningful

## Runtime Analysis
- Local R=1 gives 2.9-3.0x speedup (edge filtering works)
- Local R≥2 gives minimal speedup at N=300 (neighborhood covers most of graph)
- Speedup should scale as O(N/d^R) on large sparse graphs

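
How quickly an R-hop neighborhood fills a small graph can be checked directly with a multi-source breadth-first search from the endpoints of the deleted edge. The sketch below uses a seeded random sparse graph as a stand-in. The graph generator, the edge count of 900, and the helper names are assumptions, not the experimental setup.

```python
import random
from collections import deque


def r_hop_neighborhood(adj, sources, radius):
    """Nodes within `radius` hops of any source (multi-source BFS)."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the requested radius
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


def random_sparse_graph(n, n_edges, seed=0):
    """Hypothetical test graph: n nodes, n_edges distinct undirected edges."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    edges = []
    while len(edges) < n_edges:
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b and b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            edges.append((a, b))
    return adj, edges


adj, edges = random_sparse_graph(n=300, n_edges=900)
u, v = edges[0]  # treat the first edge as the deleted one
coverage = {
    R: len(r_hop_neighborhood(adj, [u, v], R)) / len(adj) for R in (1, 2, 3)
}
```

The fraction in `coverage[R]` is a rough proxy for the share of the graph a local update at radius R must touch. When it approaches 1, little speedup over the full recomputation can be expected.
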
## File Manifest
- Sanity checks: results/raw/sanity_*.jsonl (270 records)
- Synthetic: results/raw/full_synthetic_*.jsonl (2700 records)
- Model family: results/raw/model_family_v2_*.jsonl (1080 records)
- Real data: results/raw/real_scaled_*.jsonl (600 records)
- Total: 4,650 records

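
The record counts in the manifest can be re-verified by counting non-empty lines in each group of JSONL files. The sketch below assumes one record per non-empty line. The helpers `count_records` and `check_manifest` are hypothetical names, and the expected counts are copied from the manifest list.

```python
import glob
import os

# Expected record counts per glob pattern, as listed in the manifest.
EXPECTED = {
    "sanity_*.jsonl": 270,
    "full_synthetic_*.jsonl": 2700,
    "model_family_v2_*.jsonl": 1080,
    "real_scaled_*.jsonl": 600,
}


def count_records(raw_dir, pattern):
    """Count non-empty lines across all files matching pattern in raw_dir."""
    total = 0
    for path in sorted(glob.glob(os.path.join(raw_dir, pattern))):
        with open(path, encoding="utf-8") as fh:
            total += sum(1 for line in fh if line.strip())
    return total


def check_manifest(raw_dir, expected=EXPECTED):
    """Return {pattern: (found, expected)} for every mismatching group."""
    mismatches = {}
    for pattern, want in expected.items():
        found = count_records(raw_dir, pattern)
        if found != want:
            mismatches[pattern] = (found, want)
    return mismatches
```

Calling `check_manifest("results/raw")` from the project root would return an empty dict when every group matches its listed count.
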
## Timestamp
Generated: 2026-04-25