KennedyOfficaly committed on
Commit fdd28de · verified · 1 Parent(s): 73c795b

Upload 15 files
.gitattributes CHANGED
@@ -33,3 +33,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ axl_translate-q4_k_m_v2.gguf filter=lfs diff=lfs merge=lfs -text
+ axl-translate-f16-v2.gguf filter=lfs diff=lfs merge=lfs -text
+ axl-translate-f16-v3.gguf filter=lfs diff=lfs merge=lfs -text
+ axl-translate-f16-v4.gguf filter=lfs diff=lfs merge=lfs -text
+ axl-translate-f16.gguf filter=lfs diff=lfs merge=lfs -text
+ axl-translate-q4_k_m_real.gguf filter=lfs diff=lfs merge=lfs -text
Modelfile ADDED
@@ -0,0 +1,14 @@
+ FROM ./axl-translate-f16.gguf
+
+ TEMPLATE """{{ .System }}
+
+ User: {{ .Prompt }}
+ Assistant: """
+
+ SYSTEM """You are AXL-Translate, a code generation assistant built by Koinic."""
+
+ PARAMETER temperature 0.8
+ PARAMETER top_k 40
+ PARAMETER top_p 0.9
+ PARAMETER repeat_penalty 1.1
+ PARAMETER num_ctx 256
README.md CHANGED
@@ -1,3 +1,190 @@
  ---
  license: apache-2.0
+ language:
+ - code
+ tags:
+ - code-generation
+ - multi-scale-transformer
+ - cpu-optimized
+ - koinic
+ - pytorch
+ - llama
+ - gguf
+ - byte-level
+ - code-translation
+ - multi-language
+ pipeline_tag: text-generation
+ library_name: transformers
+ datasets:
+ - koinic/axl-translation-pairs
+ widget:
+ - text: "Translate:\ndef add(a, b):\n return a + b\nResult:"
+ model-index:
+ - name: AXL-Translate
+   results:
+   - task:
+       type: text-generation
+     metrics:
+     - name: Perplexity (byte-level)
+       type: perplexity
+       value: 1.86
  ---
+
+ # AXL-Translate
+
+ A code translation model for Python, JavaScript, and Rust: 15.2M parameters, byte-level perplexity 1.86, 256-byte context. Part of the AXL model family by [KoinicLabs](https://huggingface.co/KoinicLabs).
+
+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | Developed by | [KoinicLabs](https://huggingface.co/KoinicLabs) |
+ | Architecture | Multi-Scale Transformer |
+ | Parameters | 15M |
+ | Optimizer | SGD |
+ | Attention | SDPA |
+ | Vocab Size | 258 (byte-level) |
+ | Context Window | 256 bytes |
+ | d_model | 256 |
+ | Attention Heads | 4 |
+ | Layers per Scale | 4 |
+ | Downsample Factors | [1, 2, 4] |
+ | License | Apache 2.0 |
+
+ ### Sources
+
+ - **Repository:** [GitHub](https://github.com/Koinic/AXL)
+ - **Organization:** [KoinicLabs](https://huggingface.co/KoinicLabs)
+
+ ## Uses
+
+ ### Direct Use
+
+ Code translation between Python, JavaScript, and Rust.
+
+ ```python
+ import torch
+ from multiscale_transformer.model.config import load_config
+ from multiscale_transformer.model.model import MultiScaleTransformer
+ from multiscale_transformer.training.tokenizer import ByteTokenizer
+
+ config = load_config("config.json")
+ model = MultiScaleTransformer(config)
+ ckpt = torch.load("axl_translate.pt", map_location="cpu")
+ model.load_state_dict(ckpt["model_state_dict"])
+ model.eval()
+
+ tokenizer = ByteTokenizer()
+ ids = torch.tensor([tokenizer.encode("def hello():")], dtype=torch.long)
+ with torch.no_grad():
+     out = model.generate(ids, max_new_tokens=50, temperature=0.8)
+ print(tokenizer.decode(out[0].tolist()))
+ ```
+
+ ### Out-of-Scope Use
+
+ This is a translation-specific model; it is not intended for general code generation. For integration with tools like Continue.dev, LlamaIndex, or LangChain, use the Python API server, which provides OpenAI-compatible endpoints.
+
+ ## Bias, Risks, and Limitations
+
+ Byte-level perplexity is not comparable to BPE-level perplexity. The model is specialized for translation, supports only Python, JavaScript, and Rust, and has a maximum context of 256 bytes. Note: the GGUF files for Ollama use a simplified single-stack encoder; for full AXL quality, use the Python API server.
+
+ ### Recommendations
+
+ - Use for prototyping and experimentation, not production code generation.
+ - Byte-level perplexity (258 vocab) is not comparable to BPE-level perplexity (32K vocab).
+ - For better results, use the Lion-optimized version if available.
+
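To make the second bullet concrete: a tokenizer-independent way to compare such numbers is bits per byte, which for a byte-level model is simply log2 of its perplexity (a sketch; 1.86 is the byte-level perplexity reported on this card):

```python
import math

def bits_per_byte(byte_level_ppl: float) -> float:
    # For a byte-level model, bits/byte = log2(perplexity).
    # A BPE model's perplexity needs an extra tokens-per-byte
    # factor before the two numbers are comparable.
    return math.log2(byte_level_ppl)

print(round(bits_per_byte(1.86), 3))  # 0.895
```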
+ ## Training Details
+
+ ### Training Data
+
+ The training code was rewritten from NumPy to PyTorch. The model was trained with the Lion optimizer on roughly 3 MB of translation pairs in about 10 minutes.
+
+ ### Preprocessing
+
+ Byte-level tokenization with vocabulary size 258 (256 bytes + BOS + EOS). No vocabulary training is required.
+
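Because the vocabulary is just the 256 byte values plus two specials, such a tokenizer fits in a few lines. A minimal sketch (the real implementation is `multiscale_transformer.training.tokenizer.ByteTokenizer`; the BOS=256 / EOS=257 id assignment here is an assumption for illustration):

```python
# Vocab of 258: ids 0-255 are raw byte values, plus BOS and EOS specials.
# The BOS=256 / EOS=257 assignment is an assumption, not the repo's spec.
BOS, EOS = 256, 257

def encode(text: str) -> list[int]:
    return [BOS] + list(text.encode("utf-8")) + [EOS]

def decode(ids: list[int]) -> str:
    # Drop special ids, then decode the remaining raw bytes.
    return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")

print(encode("hi"))          # [256, 104, 105, 257]
print(decode(encode("hi")))  # hi
```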
+ ### Speeds, Sizes, Times
+
+ | Metric | Value |
+ |--------|-------|
+ | Training Steps | 3942 |
+ | Training Time | 10 min |
+ | Final Loss | 0.6703 |
+
+ ## Evaluation
+
+ ### Metrics
+
+ Perplexity on held-out Python code using byte-level tokenization.
+
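Concretely, perplexity here is the exponential of the mean per-byte cross-entropy (negative log-likelihood) on the held-out set. The function below is an illustrative sketch, not the repo's evaluation code:

```python
import math

def perplexity(per_byte_nll: list[float]) -> float:
    # exp of the mean negative log-likelihood, measured in nats per byte
    return math.exp(sum(per_byte_nll) / len(per_byte_nll))

# Sanity check: a uniform guess over 256 byte values has NLL = ln(256)
# per byte, which corresponds to a perplexity of exactly 256.
print(round(perplexity([math.log(256)] * 4), 1))  # 256.0
```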
+ ### Results
+
+ | Metric | Value |
+ |--------|-------|
+ | Perplexity (byte-level) | 1.86 |
+ | Final Loss | 0.6703 |
+ | Training Steps | 3942 |
+ | Training Time | 10 min |
+
+ **Summary:** Converts code between Python, JavaScript, and Rust.
+
+ ## Environmental Impact
+
+ | Property | Value |
+ |----------|-------|
+ | Hardware | AMD Ryzen 5 5600G |
+ | Hours Used | 0.167 |
+ | Carbon Emitted | 0.0070 kg CO2 |
+ | Cloud Provider | None (local CPU) |
+
+ ## Technical Specifications
+
+ ### Model Architecture
+
+ A Multi-Scale Transformer with three parallel encoder stacks at resolution scales 1x, 2x, and 4x. Cross-scale attention connects all scale pairs, and the scales are combined by adaptive gating fusion. The feed-forward blocks use SwiGLU, and positions are encoded with RoPE.
+
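The downsample factors [1, 2, 4] mean a 256-byte input is also processed at lengths 128 and 64. A pure-Python sketch of one plausible downsampling step, average pooling over adjacent positions (an assumption; the repo may instead use strided projections):

```python
def downsample(seq: list[list[float]], factor: int) -> list[list[float]]:
    """Average-pool groups of `factor` adjacent embedding vectors.

    Illustrative only: the actual AXL downsampling mechanism is not
    specified on this card and may differ (e.g. strided convolutions).
    """
    if factor == 1:
        return list(seq)
    out = []
    for i in range(0, len(seq) - len(seq) % factor, factor):
        group = seq[i:i + factor]
        out.append([sum(vals) / factor for vals in zip(*group)])
    return out

seq = [[float(i)] for i in range(256)]  # 256 one-dimensional "embeddings"
print([len(downsample(seq, f)) for f in (1, 2, 4)])  # [256, 128, 64]
```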
+ ### Compute Infrastructure
+
+ | Property | Value |
+ |----------|-------|
+ | Hardware | AMD Ryzen 5 5600G (6 cores, 12 threads) |
+ | RAM | 16 GB |
+ | GPU | None (CPU-only) |
+
+ ## Citation
+
+ ```bibtex
+ @misc{axl_2026,
+   title={AXL: AXL-Translate - Multi-Scale Transformer for CPU Code Generation},
+   author={Koinic},
+   year={2026},
+   url={https://huggingface.co/KoinicLabs}
+ }
+ ```
+
+ ## How to Get Started
+
+ ### With Ollama
+
+ ```bash
+ ollama create axl-translate -f Modelfile
+ ollama run axl-translate "def fibonacci():"
+ ```
+
+ ### With Python
+
+ ```python
+ import torch
+ from multiscale_transformer.model.config import load_config
+ from multiscale_transformer.model.model import MultiScaleTransformer
+ from multiscale_transformer.training.tokenizer import ByteTokenizer
+
+ config = load_config("config.json")
+ model = MultiScaleTransformer(config)
+ ckpt = torch.load("axl_translate.pt", map_location="cpu")
+ model.load_state_dict(ckpt["model_state_dict"])
+ model.eval()
+
+ tokenizer = ByteTokenizer()
+ prompt = "def fibonacci():"
+ ids = torch.tensor([tokenizer.encode(prompt)], dtype=torch.long)
+ with torch.no_grad():
+     out = model.generate(ids, max_new_tokens=100, temperature=0.8, top_k=40)
+ print(tokenizer.decode(out[0].tolist()))
+ ```
axl-translate-f16-v2.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36d437c8e240109075713e8613d38f17cbe3b56005c95b16061bb0ee649113db
+ size 30503808
axl-translate-f16-v3.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:866df01df576fd6409a0acae7a2176de70cfd74c46aeecc14c07c96b2f9d5086
+ size 30503808
axl-translate-f16-v4.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c56f7795cf9740eb06ff9f3d20e45928ef378d7f46b73104dad4f67fb7e3348c
+ size 30500064
axl-translate-f16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2109bac554beb4528862a702fe0ed734aed6d186d7ea7bdf87813d005a2ea890
+ size 30504224
axl-translate-q4_k_m_real.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cdbcf6d0d9d822c370b59d9e3f9d7184d6fabffb67f656ecf293ef72b5ce94a3
+ size 17935136
axl_translate-q4_k_m_v2.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bf2d3372888fd79686eeea9dab8f112b547eba7066e20f35a4434ceba8b1f2f3
+ size 17934656
axl_translate.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6fb76618616b6bab83ac19c4f6c7acb8306e31cd4409a542dc2e891d7c9c5b62
+ size 60789239
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "model_type": "multiscale_transformer",
+   "architectures": [
+     "MultiScaleForCausalLM"
+   ],
+   "vocab_size": 258,
+   "d_model": 256,
+   "n_heads": 4,
+   "d_ff": 688,
+   "n_layers_per_scale": 4,
+   "n_cross_attn_layers": 1,
+   "max_seq_len": 256,
+   "dropout": 0.0,
+   "bias": false,
+   "rope_theta": 10000.0,
+   "downsample_factors": [
+     1,
+     2,
+     4
+   ],
+   "num_parameters": 14376192,
+   "training_results": {
+     "model": "AXL-Translate",
+     "params": 15174400,
+     "steps": 3942,
+     "time": 600.0003161430359,
+     "final_loss": 0.6703445315361023,
+     "perplexity": 1.86,
+     "max_seq_len": 256
+   }
+ }
generation_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "max_new_tokens": 256,
+   "temperature": 0.8,
+   "top_k": 40,
+   "top_p": 0.9,
+   "repetition_penalty": 1.1,
+   "do_sample": true
+ }
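These defaults (temperature 0.8, top-k 40, top-p 0.9) shape how logits become sampled tokens. A self-contained sketch of temperature scaling plus top-k filtering (top-p and repetition penalty omitted for brevity; this is illustrative, not the repo's sampler):

```python
import math
import random

def sample_top_k(logits, temperature=0.8, top_k=40, seed=None):
    """Sample a token id from raw logits with temperature + top-k.

    Illustrative sketch only; AXL's actual sampling loop may differ.
    """
    rng = random.Random(seed)
    scaled = [l / temperature for l in logits]          # temperature scaling
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = order[:top_k]                                # top-k filtering
    m = max(scaled[i] for i in keep)                    # stable softmax
    weights = [math.exp(scaled[i] - m) for i in keep]
    r = rng.random() * sum(weights)
    for idx, w in zip(keep, weights):
        r -= w
        if r <= 0:
            return idx
    return keep[-1]

# With one dominant logit, that token is effectively always chosen.
print(sample_top_k([0.0] * 10 + [50.0], seed=0))  # 10
```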
index.html ADDED
@@ -0,0 +1,95 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width,initial-scale=1.0">
+ <title>AXL-Translate - AXL</title>
+ <style>*{margin:0;padding:0;box-sizing:border-box}
+ body{font-family:-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.6}
+ a{color:#58a6ff;text-decoration:none}a:hover{text-decoration:underline}
+ .hero{padding:40px 20px;text-align:center;border-bottom:1px solid #30363d;background:linear-gradient(135deg,#0d1117,#161b22,#0d1117)}
+ .hero h1{font-size:2.2rem;color:#fff;letter-spacing:-1px}
+ .cat{display:inline-block;padding:3px 12px;border-radius:12px;font-size:.75rem;font-weight:600;margin-bottom:12px}
+ .cat.Lion{background:#1f3a5f;color:#4285f4}
+ .cat.SGD{background:#3d1f1f;color:#f85149}
+ .cat.Specialized{background:#2d1b69;color:#bb86fc}
+ .desc{color:#8b949e;font-size:.95rem;max-width:600px;margin:12px auto 0}
+ .ms{display:flex;flex-wrap:wrap;gap:12px;justify-content:center;padding:24px 20px}
+ .mc{background:#161b22;border:1px solid #30363d;border-radius:10px;padding:16px 24px;text-align:center;min-width:120px}
+ .v{font-size:1.5rem;font-weight:700;color:#fff}.l{font-size:.75rem;color:#8b949e;margin-top:2px}
+ .tabs{max-width:800px;margin:0 auto;padding:0 20px}
+ .tabs>input[type=radio]{display:none}
+ .tl{display:inline-block;background:#21262d;border:1px solid #30363d;color:#8b949e;padding:7px 16px;border-radius:8px;cursor:pointer;font-size:.85rem;margin:0 4px 16px;transition:all .2s}
+ .tl:hover{background:#30363d;color:#c9d1d9}
+ .p{display:none;background:#161b22;border:1px solid #30363d;border-radius:12px;padding:24px;margin-bottom:24px}
+ #t1:checked~.p1,#t2:checked~.p2,#t3:checked~.p3,#t4:checked~.p4{display:block}
+ #t1:checked+label[for=t1],#t2:checked+label[for=t2],#t3:checked+label[for=t3],#t4:checked+label[for=t4]{background:#bb86fc;color:#fff;border-color:#bb86fc}
+ table{width:100%;border-collapse:collapse}
+ th{text-align:left;color:#8b949e;font-size:.8rem;padding:8px 12px;border-bottom:1px solid #21262d;font-weight:600}
+ td{padding:8px 12px;font-size:.9rem;border-bottom:1px solid #21262d}
+ pre{background:#0d1117;padding:14px;border-radius:8px;overflow-x:auto;margin:12px 0}
+ code{color:#c9d1d9;font-size:.82rem;line-height:1.5}
+ .note{background:#21262d;border-left:3px solid #bb86fc;padding:12px 16px;border-radius:0 8px 8px 0;margin:12px 0;font-size:.85rem;color:#8b949e}
+ .story{font-size:.9rem;color:#8b949e;line-height:1.6;margin:8px 0}
+ .back{text-align:center;padding:24px 20px 40px}
+ .back a{color:#58a6ff;font-size:.9rem}
+ @media(max-width:768px){.hero h1{font-size:1.6rem}.ms{flex-direction:column;align-items:center}.mc{min-width:200px}}</style>
+ </head>
+ <body>
+ <div class="hero">
+ <div class="cat Specialized">Specialized Optimized</div>
+ <h1>AXL-Translate</h1>
+ <p class="desc">Code translation. Python-JS-Rust. 15.2M params. PPL 1.86. Context 256 bytes.</p>
+ </div>
+ <div class="ms">
+ <div class="mc"><div class="v">15M</div><div class="l">Parameters</div></div>
+ <div class="mc"><div class="v">1.86</div><div class="l">Perplexity</div></div>
+ <div class="mc"><div class="v">10 min</div><div class="l">Training</div></div>
+ <div class="mc"><div class="v">31 MB</div><div class="l">GGUF</div></div>
+ </div>
+ <div class="tabs">
+ <input type="radio" name="t" id="t1" checked><label for="t1" class="tl">Specs</label>
+ <input type="radio" name="t" id="t2"><label for="t2" class="tl">Training</label>
+ <input type="radio" name="t" id="t3"><label for="t3" class="tl">Usage</label>
+ <input type="radio" name="t" id="t4"><label for="t4" class="tl">Download</label>
+ <div class="p p1">
+ <table>
+ <tr><th>Property</th><th>Value</th></tr>
+ <tr><td>Architecture</td><td>Multi-Scale Transformer</td></tr>
+ <tr><td>d_model</td><td>?</td></tr>
60
+ <tr><td>Attention Heads</td><td>?</td></tr>
61
+ <tr><td>Layers per Scale</td><td>?</td></tr>
62
+ <tr><td>Context Window</td><td>256 bytes</td></tr>
63
+ <tr><td>Downsample Factors</td><td>[1, 2, 4]</td></tr>
64
+ <tr><td>Vocab Size</td><td>258 (byte-level)</td></tr>
65
+ <tr><td>Optimizer</td><td>SGD</td></tr>
66
+ </table>
67
+ </div>
68
+ <div class="p p2">
69
+ <div class="story">Rewritten from numpy to PyTorch. Trained with Lion on 3MB translation pairs. 10 min.</div>
70
+ <table>
71
+ <tr><th>Metric</th><th>Value</th></tr>
72
+ <tr><td>Final Loss</td><td>0.6703</td></tr>
73
+ <tr><td>Perplexity</td><td>1.86</td></tr>
74
+ <tr><td>Training Steps</td><td>3942</td></tr>
75
+ <tr><td>Training Time</td><td>10 min</td></tr>
76
+ </table>
77
+ </div>
78
+ <div class="p p3">
79
+ <h3 style="color:#fff;margin-bottom:12px">Usage</h3>
80
+ <pre><code>ollama create axl-translate -f Modelfile
81
+ ollama run axl-translate "def fibonacci():"</code></pre>
82
+ <div class="note">Converts code between Python, JavaScript, and Rust.</div>
83
+ </div>
84
+ <div class="p p4">
85
+ <table>
86
+ <tr><th>File</th><th>Size</th><th>Format</th></tr>
87
+ <tr><td>F16 GGUF</td><td>31 MB</td><td>Full precision</td></tr>
88
+ <tr><td>Q4_K_M GGUF</td><td>18 MB</td><td>4-bit quantized</td></tr>
89
+ </table>
90
+ <div class="note" style="margin-top:16px">GGUF files work with Ollama and llama.cpp. Q4_K_M is about 3x smaller than F16.</div>
91
+ </div>
92
+ </div>
93
+ <div class="back"><a href="../">← All AXL Models</a></div>
94
+ </body>
95
+ </html>
results.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "model": "AXL-Translate",
+   "params": 15174400,
+   "steps": 3942,
+   "time": 600.0003161430359,
+   "final_loss": 0.6703445315361023,
+   "perplexity": 1.86,
+   "max_seq_len": 256
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "pad_token": "[PAD]",
+   "bos_token": "[BOS]",
+   "eos_token": "[EOS]",
+   "unk_token": "[UNK]"
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "tokenizer_class": "ByteTokenizer",
+   "vocab_size": 258,
+   "pad_token": "[PAD]",
+   "bos_token": "[BOS]",
+   "eos_token": "[EOS]",
+   "unk_token": "[UNK]"
+ }