metadata
title: LLM Inference Simulator — Extended
emoji: 🚀
colorFrom: gray
colorTo: purple
sdk: static
pinned: false
LLM Inference — Pipeline Simulator v2
Extended version of the simulator, covering concurrency, batching, prefix caching, and memory constraints. From EXD Episode 4: Performance Tuning.
Built with vanilla HTML/CSS/JS.
📺 Watch the episode | 💻 GitHub