The largest of the Lion-optimized models: 318M parameters, trained in 20 minutes to a perplexity of 1.90 with a 256-byte context window.

| Property | Value |
|---|---|
| Architecture | Multi-Scale Transformer |
| d_model | ? |
| Attention Heads | ? |
| Layers per Scale | ? |
| Context Window | 256 bytes |
| Downsample Factors | [1, 2, 4] |
| Vocab Size | 258 (byte-level) |
| Optimizer | Lion |
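
The 258-entry vocabulary points to a byte-level codec: 256 raw byte values plus two special tokens. A minimal sketch of what encoding could look like (the special-token IDs and names here are assumptions, not documented values):

```python
# Hypothetical byte-level codec for a 258-entry vocab:
# 256 raw byte values plus two special tokens (IDs assumed, not documented).
BOS, EOS = 256, 257

def encode(text: str, context: int = 256) -> list[int]:
    """UTF-8 bytes -> token IDs, truncated to the 256-byte context window."""
    ids = [BOS] + list(text.encode("utf-8")) + [EOS]
    return ids[:context]

def decode(ids: list[int]) -> str:
    """Token IDs -> text, dropping any special tokens."""
    return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")

print(encode("def fibonacci():")[:8])      # [256, 100, 101, 102, 32, 102, 105, 98]
print(decode(encode("def fibonacci():")))  # def fibonacci():
```

The downsample factors [1, 2, 4] suggest the three scales process the byte sequence at full, half, and quarter resolution; the exact pooling scheme is not documented here.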

Training results:

| Metric | Value |
|---|---|
| Final Loss | 0.6338 |
| Perplexity | 1.90 |
| Training Steps | 421 |
| Training Time | 20 min |
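
The reported perplexity is consistent with the final loss: assuming the loss is mean cross-entropy in nats, exp(0.6338) ≈ 1.89, which rounds to the 1.90 shown. For a byte-level model that is equivalent to about log2(1.90) ≈ 0.93 bits per byte.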

Build and run with Ollama:

```bash
ollama create axl-code-1b-lion -f Modelfile
ollama run axl-code-1b-lion "def fibonacci():"
```
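
The create step expects a Modelfile next to the GGUF. A minimal sketch, assuming the Q4_K_M file from the table below (the filename and the context parameter are assumptions, not shipped values):

```
# Minimal Modelfile sketch; the GGUF filename is an assumption
FROM ./axl-code-1b-lion.Q4_K_M.gguf

# Match the model's 256-byte training context (assumed appropriate)
PARAMETER num_ctx 256
```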

Available files:

| File | Size | Format |
|---|---|---|
| F16 GGUF | 636 MB | Unquantized (16-bit) |
| Q4_K_M GGUF | 197 MB | 4-bit quantized |
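
Both sizes line up with the 318M parameter count: 318M parameters at 2 bytes each gives roughly 636 MB at F16, and 197 MB for Q4_K_M works out to about 5 bits per parameter, in the expected range for that quantization scheme.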