Hugging Face's tiny but capable 360M-parameter model. Q8_0 (369 MB). Loads in seconds.
SmolLM2-360M-Instruct via wllama WebGPU. Built for AMD Strix Halo unified memory.