---
license: apache-2.0
language:
- code
tags:
- code-generation
- multi-scale-transformer
- cpu-optimized
- koinic
- pytorch
- llama
- gguf
- byte-level
pipeline_tag: text-generation
library_name: transformers
datasets:
- bigcode/starcoderdata
- theblackcat102/evol-codealpaca-v1
widget:
- text: "To be or not to be"
model-index:
- name: AXL-Micro-600K
  results:
  - task:
      type: text-generation
    metrics:
    - name: Perplexity (byte-level)
      type: perplexity
      value: 1.04
---

# AXL-Micro-600K

Smallest AXL model: 677K parameters, byte-level perplexity 1.04, 256-byte context. A demo model, part of the AXL model family by [KoinicLabs](https://huggingface.co/KoinicLabs).

## Model Details

| Property | Value |
|----------|-------|
| Developed by | [KoinicLabs](https://huggingface.co/KoinicLabs) |
| Architecture | Multi-Scale Transformer |
| Parameters | 677,056 |
| Optimizer | Lion |
| Attention | SDPA |
| Vocab Size | 258 (byte-level) |
| Context Window | 256 bytes |
| d_model | 64 |
| Attention Heads | 4 |
| Layers per Scale | 2 |
| Downsample Factors | [1, 2, 4] |
| License | Apache 2.0 |

### Sources

- **Repository:** [GitHub](https://github.com/Koinic/AXL)
- **Organization:** [KoinicLabs](https://huggingface.co/KoinicLabs)

## Uses

### Direct Use

Demo/testing model (trained on Shakespeare).

```python
import torch
from multiscale_transformer.model.config import load_config
from multiscale_transformer.model.model import MultiScaleTransformer
from multiscale_transformer.training.tokenizer import ByteTokenizer

# Load the config and checkpoint, then restore the weights
config = load_config("config.json")
ckpt = torch.load("axl_micro_600k.pt", map_location="cpu")
model = MultiScaleTransformer(config)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

tokenizer = ByteTokenizer()
ids = torch.tensor([tokenizer.encode("def hello():")], dtype=torch.long)
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=50, temperature=0.8)
print(tokenizer.decode(out[0].tolist()))
```

### Out-of-Scope Use

Not intended for production use or real code-generation tasks. For integration with tools like Continue.dev, LlamaIndex, or LangChain, use the Python API server, which provides OpenAI-compatible endpoints.

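Since the API server is OpenAI-compatible, any standard completions client should work against it. A minimal sketch using only the standard library; the host, port, and model name here are assumptions, not values from this card:

```python
import json
import urllib.request

# Hypothetical local endpoint: the card only states the server is
# OpenAI-compatible; adjust host/port/model name to your deployment.
payload = {
    "model": "axl-micro-600k",
    "prompt": "def fibonacci():",
    "max_tokens": 100,
    "temperature": 0.8,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the API server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```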
## Bias, Risks, and Limitations

This is a Shakespeare-trained demo model, not a code-generation model. Byte-level perplexity is not comparable to BPE-level perplexity. Note: GGUF files for Ollama use a simplified single-stack encoder; for full AXL quality, use the Python API server.

### Recommendations

- Use for prototyping and experimentation, not production code generation.
- Byte-level perplexity (258-token vocab) is not comparable to BPE-level perplexity (32K vocab).
- For better results, use the Lion-optimized version if available.

## Training Details

### Training Data

Retrained with the Lion optimizer on Shakespeare text: 2,435 steps in 2 minutes, reaching a byte-level PPL of 1.04.

### Preprocessing

Byte-level tokenization with a vocabulary of 258 (256 byte values + BOS + EOS). No vocabulary training required.

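Because the vocabulary is just the 256 raw byte values plus two special tokens, the scheme needs no learned merges. A minimal sketch of such a tokenizer (illustrative helper functions, not the actual `ByteTokenizer` API):

```python
# Byte-level tokenization sketch: 256 raw byte values plus BOS (256) and EOS (257).
BOS, EOS, VOCAB_SIZE = 256, 257, 258

def encode(text: str) -> list[int]:
    """UTF-8 bytes framed by BOS/EOS; no vocabulary training needed."""
    return [BOS] + list(text.encode("utf-8")) + [EOS]

def decode(ids: list[int]) -> str:
    """Drop special tokens and decode the remaining bytes."""
    return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")

print(encode("Hi"))          # [256, 72, 105, 257]
print(decode(encode("Hi")))  # Hi
```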
### Speeds, Sizes, Times

| Metric | Value |
|--------|-------|
| Training Steps | 2,435 |
| Training Time | 2 min |
| Final Loss | 0.0747 |

## Evaluation

### Metrics

Perplexity on held-out Python code using byte-level tokenization.

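Because perplexity here is per byte, it maps directly onto bits-per-byte, a metric that can be compared across tokenizers. A quick illustration of the standard conversion (generic math, not code from the AXL repository):

```python
import math

def bits_per_byte(byte_ppl: float) -> float:
    """Bits-per-byte from byte-level perplexity: log2(ppl)."""
    return math.log2(byte_ppl)

# A byte-level PPL of 1.04 corresponds to roughly 0.057 bits per byte.
print(round(bits_per_byte(1.04), 4))
```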
### Results

| Metric | Value |
|--------|-------|
| Perplexity (byte-level) | 1.04 |
| Final Loss | 0.0747 |
| Training Steps | 2,435 |
| Training Time | 2 min |

**Summary:** A Shakespeare-trained demo model for testing the architecture.

## Environmental Impact

| Property | Value |
|----------|-------|
| Hardware | AMD Ryzen 5 5600G |
| Hours Used | 0.033 |
| Carbon Emitted | 0.0014 kg CO2 |
| Cloud Provider | None (local CPU) |

## Technical Specifications

### Model Architecture

Multi-Scale Transformer with three parallel encoder stacks at resolution scales 1x, 2x, and 4x. Cross-scale attention connects all scale pairs, and the per-scale outputs are combined through adaptive gating fusion. Feed-forward layers use SwiGLU; positional information uses RoPE.

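The downsample factors [1, 2, 4] mean each encoder stack sees the sequence at a different resolution. One simple way to form such multi-resolution views is average pooling along the sequence axis; this is an illustrative sketch, not the actual AXL implementation:

```python
import torch
import torch.nn.functional as F

def multiscale_views(x: torch.Tensor, factors=(1, 2, 4)) -> list[torch.Tensor]:
    """Downsample a (batch, seq, d_model) sequence by average pooling
    along the sequence axis, producing one view per scale factor."""
    views = []
    for f in factors:
        if f == 1:
            views.append(x)
        else:
            # Pool over the sequence: (B, S, D) -> (B, D, S) -> pool -> back
            views.append(F.avg_pool1d(x.transpose(1, 2), kernel_size=f).transpose(1, 2))
    return views

x = torch.randn(1, 256, 64)  # 256-byte context, d_model 64
print([v.shape[1] for v in multiscale_views(x)])  # [256, 128, 64]
```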
### Compute Infrastructure

| Property | Value |
|----------|-------|
| Hardware | AMD Ryzen 5 5600G (6 cores, 12 threads) |
| RAM | 16 GB |
| GPU | None (CPU-only) |

## Citation

```bibtex
@misc{axl_2026,
  title={AXL: AXL-Micro-600K - Multi-Scale Transformer for CPU Code Generation},
  author={Koinic},
  year={2026},
  url={https://huggingface.co/KoinicLabs}
}
```


## How to Get Started

### With Ollama

```bash
ollama create axl-micro-600k -f Modelfile
ollama run axl-micro-600k "def fibonacci():"
```

### With Python

```python
import torch
from multiscale_transformer.model.config import load_config
from multiscale_transformer.model.model import MultiScaleTransformer
from multiscale_transformer.training.tokenizer import ByteTokenizer

# Load the config and checkpoint, then restore the weights
config = load_config("config.json")
model = MultiScaleTransformer(config)
ckpt = torch.load("axl_micro_600k.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Byte-level tokenization: encode the prompt and generate
tokenizer = ByteTokenizer()
prompt = "def fibonacci():"
ids = torch.tensor([tokenizer.encode(prompt)], dtype=torch.long)
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=100, temperature=0.8, top_k=40)
print(tokenizer.decode(out[0].tolist()))
```