SGD Optimized

AXL-Refactor-20M

Code refactoring model trained with SGD. 19.1M parameters, perplexity (PPL) 1.01, 1024-byte context window.

Parameters	19M
Perplexity	1.01
Training	5 min
GGUF	38 MB


Property	Value
Architecture	Multi-Scale Transformer
d_model	?
Attention Heads	?
Layers per Scale	?
Context Window	1024 bytes
Downsample Factors	[1, 2, 4]
Vocab Size	258 (byte-level)
Optimizer	SGD
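
The context window and downsample factors in the table above can be related with a small sketch. It assumes that each scale sees the 1024-byte context shortened by its downsample factor (length = context // factor), which the page does not state explicitly, and that byte-level input maps each UTF-8 byte to an ID in 0..255. The two extra vocabulary entries (258 rather than 256) are presumably special tokens; their IDs are not given, so the sketch leaves them out. The function names are hypothetical.

```python
CONTEXT_BYTES = 1024            # context window from the spec table
DOWNSAMPLE_FACTORS = [1, 2, 4]  # downsample factors from the spec table


def scale_lengths(context, factors):
    """Sequence length at each scale, assuming length = context // factor."""
    return [context // f for f in factors]


def encode_bytes(text):
    """Byte-level encoding: each UTF-8 byte becomes one ID in 0..255."""
    return list(text.encode("utf-8"))


lengths = scale_lengths(CONTEXT_BYTES, DOWNSAMPLE_FACTORS)
ids = encode_bytes("def fibonacci():")

print(lengths)   # per-scale sequence lengths under the stated assumption
print(len(ids))  # number of byte tokens in the prompt
```

Under these assumptions, a prompt uses one token per byte, so the 1024-byte window holds roughly 1024 ASCII characters of code.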

Trained on 7 MB of before/after code pairs for 202 steps.

Metric	Value
Final Loss	0.0081
Perplexity	1.01
Training Steps	202
Training Time	5 min
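
The loss and perplexity figures in the table are consistent with each other if the final loss is a mean cross-entropy in nats, in which case perplexity is its exponential. That interpretation is an assumption; the page does not say how the loss is measured. A minimal check:

```python
import math

# Final loss from the metrics table, assumed to be mean cross-entropy in nats.
final_loss = 0.0081

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(final_loss)

print(f"{perplexity:.2f}")  # prints 1.01
```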

Usage

ollama create axl-refactor-20m -f Modelfile
ollama run axl-refactor-20m "def fibonacci():"

This model is the refactoring baseline. For comparison, AXL-Refactor-Lion has a PPL of 1.11.

File	Size	Format
F16 GGUF	38 MB	Unquantized (16-bit float)
Q4_K_M GGUF	38 MB	4-bit quantized

GGUF files work with Ollama and llama.cpp. Q4_K_M is about 3x smaller than F16.
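
As a rough sanity check on the file sizes, the F16 figure can be estimated from the parameter count alone: 19.1M parameters at 2 bytes each. The sketch below counts weights only and ignores GGUF metadata. The Q4_K_M estimate simply applies the "about 3x" ratio quoted above; it is not a measured file size.

```python
PARAMS = 19.1e6          # parameter count from the model summary
BYTES_PER_F16 = 2        # a 16-bit float takes 2 bytes

# Weights-only size in decimal megabytes (GGUF metadata not included).
f16_mb = PARAMS * BYTES_PER_F16 / 1e6

# Estimate only: applies the approximate 3x ratio quoted for Q4_K_M.
q4_mb_estimate = f16_mb / 3

print(f"F16 weights: about {f16_mb:.1f} MB")
print(f"Q4_K_M estimate: about {q4_mb_estimate:.1f} MB")
```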
