---
license: apache-2.0
language:
- code
tags:
- code-generation
- multi-scale-transformer
- cpu-optimized
- koinic
- pytorch
- llama
- gguf
- byte-level
- commenting
pipeline_tag: text-generation
library_name: transformers
datasets:
- bigcode/starcoderdata
- theblackcat102/evol-codealpaca-v1
widget:
- text: "Code:\ndef quicksort(arr):\n    if len(arr) <= 1: return arr\nCommented:"
model-index:
- name: AXL-Comment-5M
  results:
  - task:
      type: text-generation
    metrics:
    - name: Perplexity (byte-level)
      type: perplexity
      value: 1.01
---

# AXL-Comment-5M

A 7.2M-parameter code-commenting model with byte-level perplexity 1.01 and a 512-byte context window. Part of the AXL model family by [KoinicLabs](https://huggingface.co/KoinicLabs).

## Model Details

| Property | Value |
|----------|-------|
| Developed by | [KoinicLabs](https://huggingface.co/KoinicLabs) |
| Architecture | Multi-Scale Transformer |
| Parameters | 7.2M |
| Optimizer | Lion |
| Attention | SDPA |
| Vocab Size | 258 (byte-level) |
| Context Window | 512 bytes |
| d_model | 192 |
| Attention Heads | 3 |
| Layers per Scale | 3 |
| Downsample Factors | [1, 2, 4] |
| License | Apache 2.0 |
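
As a sketch, the table above can be expressed as the kind of config the checkpoint ships with. The field names here are assumptions for illustration, not the repository's actual schema:

```python
# Hypothetical config mirroring the spec table above.
# Field names are assumptions; check the repository's config schema.
config = {
    "vocab_size": 258,             # 256 byte values + BOS + EOS
    "d_model": 192,
    "n_heads": 3,
    "layers_per_scale": 3,
    "downsample_factors": [1, 2, 4],
    "max_seq_len": 512,            # context window in bytes
}

# The head dimension must divide evenly: 192 / 3 = 64
assert config["d_model"] % config["n_heads"] == 0
print(config["d_model"] // config["n_heads"])  # 64
```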

### Sources

- **Repository:** [GitHub](https://github.com/Koinic/AXL)
- **Organization:** [KoinicLabs](https://huggingface.co/KoinicLabs)

## Uses

### Direct Use

Adding explanatory inline comments to existing code.

```python
import torch
from multiscale_transformer.model.config import load_config
from multiscale_transformer.model.model import MultiScaleTransformer
from multiscale_transformer.training.tokenizer import ByteTokenizer

# Rebuild the model from its config and load the trained weights
config = load_config("config.json")
model = MultiScaleTransformer(config)
ckpt = torch.load("axl_comment_5m.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Byte-level tokenizer: no vocabulary files needed
tokenizer = ByteTokenizer()
ids = torch.tensor([tokenizer.encode("def hello():")], dtype=torch.long)
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=50, temperature=0.8)
print(tokenizer.decode(out[0].tolist()))
```
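
The widget example in the card metadata suggests a `Code:` / `Commented:` prompt format. A small helper to build such prompts (the helper itself is hypothetical, inferred from the widget text):

```python
def build_comment_prompt(code: str) -> str:
    """Wrap raw code in the Code:/Commented: format seen in the widget example."""
    return f"Code:\n{code}\nCommented:"

prompt = build_comment_prompt("def quicksort(arr):\n    if len(arr) <= 1: return arr")
print(prompt.startswith("Code:\n"))   # True
print(prompt.endswith("Commented:"))  # True
```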

### Out-of-Scope Use

Not intended for production code generation or for non-code NLP tasks. For integration with tools such as Continue.dev, LlamaIndex, or LangChain, use the Python API server, which provides OpenAI-compatible endpoints.
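
As a sketch of what "OpenAI-compatible" means in practice, a client posts a standard completion payload to the local server. The endpoint path, port, and model name below are assumptions, not documented values:

```python
import json

# Hypothetical request body for a local OpenAI-compatible server;
# the endpoint (e.g. http://localhost:8000/v1/completions), port,
# and model name are assumptions for illustration.
payload = {
    "model": "axl-comment-5m",
    "prompt": "Code:\ndef hello():\n    pass\nCommented:",
    "max_tokens": 100,
    "temperature": 0.8,
}
body = json.dumps(payload)
# e.g. requests.post("http://localhost:8000/v1/completions", data=body)
print("axl-comment-5m" in body)  # True
```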

## Bias, Risks, and Limitations

Byte-level perplexity is not comparable to BPE-level perplexity. The maximum context is 512 bytes. Note that the GGUF files for Ollama use a simplified single-stack encoder; for full AXL quality, use the Python API server.

### Recommendations

- Use for prototyping and experimentation, not production code generation.
- Byte-level perplexity (258-token vocabulary) is not comparable to BPE-level perplexity (e.g. a 32K vocabulary).
- For better results, use the Lion-optimized version if available.

## Training Details

### Training Data

Retrained with the Lion optimizer on 20 MB of code-commenting pairs: 263 steps in about 10 minutes.

### Preprocessing

Byte-level tokenization with a vocabulary of 258 tokens (256 byte values plus BOS and EOS). No vocabulary training is required.
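
A minimal sketch of such a byte-level tokenizer. The ID layout (bytes 0-255, BOS=256, EOS=257) is an assumption for illustration; the actual `ByteTokenizer` may assign specials differently:

```python
class MiniByteTokenizer:
    """Toy byte-level tokenizer: 256 raw byte values plus BOS/EOS specials."""
    BOS, EOS = 256, 257   # assumed IDs for illustration
    vocab_size = 258

    def encode(self, text: str) -> list[int]:
        # Every UTF-8 byte maps directly to its own token ID
        return [self.BOS] + list(text.encode("utf-8")) + [self.EOS]

    def decode(self, ids: list[int]) -> str:
        # Drop special tokens, reassemble the byte string
        return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")

tok = MiniByteTokenizer()
ids = tok.encode("def hi():")
print(len(ids))         # 11 = 9 bytes + BOS + EOS
print(tok.decode(ids))  # def hi():
```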

### Speeds, Sizes, Times

| Metric | Value |
|--------|-------|
| Training Steps | 263 |
| Training Time | 10 min |
| Final Loss | 0.1476 |

## Evaluation

### Metrics

Perplexity on held-out Python code, computed with byte-level tokenization.
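
Byte-level perplexity is the exponential of the mean per-byte cross-entropy (in nats). A quick sanity check; the loss value below is illustrative, chosen to match the reported PPL, not a reported number:

```python
import math

def perplexity(mean_loss: float) -> float:
    """Perplexity from mean cross-entropy loss in nats per byte."""
    return math.exp(mean_loss)

# An eval loss of ~0.01 nats/byte corresponds to the reported PPL of 1.01
print(round(perplexity(0.01), 2))  # 1.01
```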

### Results

| Metric | Value |
|--------|-------|
| Perplexity (byte-level) | 1.01 |
| Final Loss | 0.1476 |
| Training Steps | 263 |
| Training Time | 10 min |

**Summary:** Adds inline comments to explain code logic.

## Environmental Impact

| Property | Value |
|----------|-------|
| Hardware | AMD Ryzen 5 5600G |
| Hours Used | 0.167 |
| Carbon Emitted | 0.0070 kg CO2 |
| Cloud Provider | None (local CPU) |

## Technical Specifications

### Model Architecture

Multi-Scale Transformer with three parallel encoder stacks operating at resolution scales 1x, 2x, and 4x. Cross-scale attention connects all scale pairs, and an adaptive gating mechanism fuses their outputs. SwiGLU feed-forward layers; RoPE positional encoding.
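
A minimal sketch of the multi-scale idea: each stack sees the byte sequence downsampled by its factor. Mean pooling is used here as an assumption; the actual downsampling operation may differ:

```python
def downsample(seq: list[float], factor: int) -> list[float]:
    """Mean-pool a sequence by the given factor (drops any ragged tail)."""
    return [
        sum(seq[i:i + factor]) / factor
        for i in range(0, len(seq) - factor + 1, factor)
    ]

seq = [1.0, 3.0, 5.0, 7.0]
# Three parallel views of the same input, one per scale:
scales = {f: downsample(seq, f) for f in (1, 2, 4)}
print(scales[1])  # [1.0, 3.0, 5.0, 7.0]
print(scales[2])  # [2.0, 6.0]
print(scales[4])  # [4.0]
```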

### Compute Infrastructure

| Property | Value |
|----------|-------|
| Hardware | AMD Ryzen 5 5600G (6 cores, 12 threads) |
| RAM | 16 GB |
| GPU | None (CPU-only) |

## Citation

```bibtex
@misc{axl_2026,
  title={AXL: AXL-Comment-5M - Multi-Scale Transformer for CPU Code Generation},
  author={Koinic},
  year={2026},
  url={https://huggingface.co/KoinicLabs}
}
```

## How to Get Started

### With Ollama

```bash
ollama create axl-comment-5m -f Modelfile
ollama run axl-comment-5m "def fibonacci():"
```

### With Python

```python
import torch
from multiscale_transformer.model.config import load_config
from multiscale_transformer.model.model import MultiScaleTransformer
from multiscale_transformer.training.tokenizer import ByteTokenizer

# Rebuild the model from its config and load the trained weights
config = load_config("config.json")
model = MultiScaleTransformer(config)
ckpt = torch.load("axl_comment_5m.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Byte-level tokenizer: encodes the prompt directly as UTF-8 bytes
tokenizer = ByteTokenizer()
prompt = "def fibonacci():"
ids = torch.tensor([tokenizer.encode(prompt)], dtype=torch.long)

# Sample up to 100 new bytes with top-k sampling
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=100, temperature=0.8, top_k=40)
print(tokenizer.decode(out[0].tolist()))
```