NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks Paper • 2603.06922 • Published 6 days ago
A Random Matrix Theory Perspective on the Learning Dynamics of Multi-head Latent Attention Paper • 2507.09394 • Published Jul 12, 2025
Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space? Paper • 2510.00537 • Published Oct 1, 2025 • 3
AERO: Softmax-Only LLMs for Efficient Private Inference Paper • 2410.13060 • Published Oct 16, 2024 • 4
DeepReShape: Redesigning Neural Networks for Efficient Private Inference Paper • 2304.10593 • Published Apr 20, 2023
ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models Paper • 2410.09637 • Published Oct 12, 2024 • 3
Sisyphus: A Cautionary Tale of Using Low-Degree Polynomial Activations in Privacy-Preserving Deep Learning Paper • 2107.12342 • Published Jul 26, 2021
CryptoNite: Revealing the Pitfalls of End-to-End Private Inference at Scale Paper • 2111.02583 • Published Nov 4, 2021
Modeling Data Reuse in Deep Neural Networks by Taking Data-Types into Cognizance Paper • 2008.02565 • Published Aug 6, 2020