feat: training speed optimizations — mixed precision, vectorized augmentation, cached eval predictions 1fe1a19 lemousehunter committed 25 days ago
v3: Add DTP + Spectral Decoupling, fix GradNorm OOM, fix _fail_job cancel 283a882 lemousehunter committed 25 days ago