MCAP: Deployment-Time Layer Profiling for Memory-Constrained LLM Inference • Paper • 2604.21026 • Published Apr 24