AXL-Micro-8M


Trained with SGD for 10 minutes on Shakespeare (1,723 steps). The multi-scale architecture helps even with SGD.

ollama create axl-micro-8m -f Modelfile
ollama run axl-micro-8m "def fibonacci():"
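The `ollama create` command above expects a Modelfile in the current directory. A minimal sketch of one follows, assuming the GGUF file has been downloaded next to it. The filename `axl-micro-8m-q4_k_m.gguf` and the two parameter values are illustrative assumptions, not taken from this page.

```shell
# Hypothetical filename: point this at whichever GGUF file you downloaded.
GGUF="./axl-micro-8m-q4_k_m.gguf"

# Write a minimal Modelfile. FROM is the only required instruction;
# the PARAMETER lines are optional and the values here are placeholders.
cat > Modelfile <<EOF
FROM ${GGUF}
PARAMETER temperature 0.8
PARAMETER num_predict 128
EOF

# Show the result before passing it to `ollama create`.
cat Modelfile
```

With this file in place, the two `ollama` commands above should register and run the model, provided the GGUF path is correct.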

This is the SGD baseline: the multi-scale architecture helps even without the Lion optimizer.

File	Size	Format
F16 GGUF	15 MB	16-bit float (unquantized)
Q4_K_M GGUF	15 MB	4-bit quantized

The GGUF files work with Ollama and llama.cpp. Q4_K_M is usually about 3x smaller than F16, though both files are listed at 15 MB here.