Article • SDXL in 4 steps with Latent Consistency LoRAs • pcuenq, valhalla, SimianLuo, dg845, tyq1024, sayakpaul, multimodalart • Nov 9, 2023 • 15
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 54
Article • From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels • drbh, danieldk • Aug 18, 2025 • 97