Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction Paper • 2604.00733 • Published 6 days ago