KennedyOfficaly committed (verified)
Commit 8299d9f · 1 parent: 0318953

Upload 12 files
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+axl_chat-q4_k_m_v2.gguf filter=lfs diff=lfs merge=lfs -text
+axl-chat-f16.gguf filter=lfs diff=lfs merge=lfs -text
+axl-chat-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
Modelfile ADDED
@@ -0,0 +1,14 @@
+FROM ./axl_chat_10m-f16.gguf
+
+TEMPLATE """{{ .System }}
+
+User: {{ .Prompt }}
+Assistant: """
+
+SYSTEM """You are AXL-Chat-10M, a code generation assistant built by Koinic."""
+
+PARAMETER temperature 0.8
+PARAMETER top_k 40
+PARAMETER top_p 0.9
+PARAMETER repeat_penalty 1.1
+PARAMETER num_ctx 256
README.md CHANGED
@@ -1,3 +1,189 @@
 ---
 license: apache-2.0
+language:
+- code
+tags:
+- code-generation
+- multi-scale-transformer
+- cpu-optimized
+- koinic
+- pytorch
+- llama
+- gguf
+- byte-level
+- conversational
+pipeline_tag: text-generation
+library_name: transformers
+datasets:
+- koinic/axl-chat-pairs
+widget:
+- text: "User: How do I read a CSV in Python?\nAssistant:"
+model-index:
+- name: AXL-Chat-10M
+  results:
+  - task:
+      type: text-generation
+    metrics:
+    - name: Perplexity (byte-level)
+      type: perplexity
+      value: 1.02
 ---
+
+# AXL-Chat-10M
+
+Conversational AI. 9.9M params. PPL 1.02. Context 512 bytes. Part of the AXL model family by [KoinicLabs](https://huggingface.co/KoinicLabs).
+
+## Model Details
+
+| Property | Value |
+|----------|-------|
+| Developed by | [KoinicLabs](https://huggingface.co/KoinicLabs) |
+| Architecture | Multi-Scale Transformer |
+| Parameters | 10M |
+| Optimizer | Lion |
+| Attention | SDPA |
+| Vocab Size | 258 (byte-level) |
+| Context Window | 512 bytes |
+| d_model | 224 |
+| Attention Heads | 4 |
+| Layers per Scale | 3 |
+| Downsample Factors | [1, 2, 4] |
+| License | Apache 2.0 |
+
+### Sources
+
+- **Repository:** [GitHub](https://github.com/Koinic/AXL)
+- **Organization:** [KoinicLabs](https://huggingface.co/KoinicLabs)
+
+## Uses
+
+### Direct Use
+
+Conversational AI for programming Q&A.
+
+```python
+import torch
+from multiscale_transformer.model.config import load_config
+from multiscale_transformer.model.model import MultiScaleTransformer
+from multiscale_transformer.training.tokenizer import ByteTokenizer
+
+config = load_config("config.json")  # build the model from its saved config
+model = MultiScaleTransformer(config)
+ckpt = torch.load("axl_chat_10m.pt", map_location="cpu")
+model.load_state_dict(ckpt["model_state_dict"])
+model.eval()
+
+tokenizer = ByteTokenizer()
+ids = torch.tensor([tokenizer.encode("def hello():")], dtype=torch.long)
+with torch.no_grad():
+    out = model.generate(ids, max_new_tokens=50, temperature=0.8)
+print(tokenizer.decode(out[0].tolist()))
+```
+
+### Out-of-Scope Use
+
+Not for production code generation. Not for non-code NLP tasks. For integration with tools like Continue.dev, LlamaIndex, or LangChain, use the Python API server, which provides OpenAI-compatible endpoints.
+
+## Bias, Risks, and Limitations
+
+Byte-level perplexity is not comparable to BPE-level perplexity. Max context is 512 bytes. Note: the GGUF files for Ollama use a simplified single-stack encoder; for full AXL quality, use the Python API server.
+
+### Recommendations
+
+- Use for prototyping and experimentation, not production code generation.
+- Byte-level perplexity (258 vocab) is not comparable to BPE-level perplexity (32K vocab).
+- For better results, use the Lion-optimized version if available.
+
+## Training Details
+
+### Training Data
+
+Retrained with the Lion optimizer on 10 MB of chat pairs covering code Q&A and general knowledge. 216 steps in 10 minutes.
+
+### Preprocessing
+
+Byte-level tokenization with vocabulary size 258 (256 bytes + BOS + EOS). No vocabulary training required.
+
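The byte-level scheme above needs no trained vocabulary, which a few lines make concrete. This is a minimal sketch, not the repo's actual `ByteTokenizer`; the BOS=256 / EOS=257 id assignments are assumptions.

```python
# Sketch of a 258-symbol byte-level tokenizer: 256 raw byte values plus
# BOS and EOS. Hypothetical ids; the real ByteTokenizer may differ.
BOS, EOS = 256, 257

def encode(text: str) -> list[int]:
    """UTF-8 bytes wrapped in BOS/EOS; no vocabulary training needed."""
    return [BOS] + list(text.encode("utf-8")) + [EOS]

def decode(ids: list[int]) -> str:
    """Drop the special ids and reassemble the byte string."""
    return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")

ids = encode("def hello():")
print(ids[:4])      # [256, 100, 101, 102]  ('d'=100, 'e'=101, 'f'=102)
print(decode(ids))  # def hello():
```

Because every UTF-8 string maps to bytes directly, there is no out-of-vocabulary case at all, which is why no `[UNK]` handling is needed in practice.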
+### Speeds, Sizes, Times
+
+| Metric | Value |
+|--------|-------|
+| Training Steps | 216 |
+| Training Time | 10 min |
+| Final Loss | 0.3650 |
+
+## Evaluation
+
+### Metrics
+
+Perplexity on held-out Python code using byte-level tokenization.
+
+### Results
+
+| Metric | Value |
+|--------|-------|
+| Perplexity (byte-level) | 1.02 |
+| Final Loss | 0.3650 |
+| Training Steps | 216 |
+| Training Time | 10 min |
+
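For reference, cross-entropy loss and perplexity are linked by PPL = exp(loss) when both are measured per byte on the same data. A quick check of the conversion (the reported perplexity is evaluated on held-out code, so it need not equal the exponential of the final training loss):

```python
import math

# For a byte-level LM trained with cross-entropy, perplexity is the
# exponential of the mean negative log-likelihood per byte:
#   PPL = exp(loss)
ppl_from_final_train_loss = math.exp(0.3650)
print(round(ppl_from_final_train_loss, 2))  # 1.44
```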
126
+ **Summary:** Good for code explanation and Q&A.
127
+
128
+ ## Environmental Impact
129
+
130
+ | Property | Value |
131
+ |----------|-------|
132
+ | Hardware | AMD Ryzen 5 5600G |
133
+ | Hours Used | 0.167 |
134
+ | Carbon Emitted | 0.0070 kg CO2 |
135
+ | Cloud Provider | None (local CPU) |
136
+
137
+ ## Technical Specifications
138
+
139
+ ### Model Architecture
140
+
141
+ Multi-Scale Transformer with three parallel encoder stacks at resolution scales 1x, 2x, and 4x. Cross-scale attention connects all scale pairs. Adaptive gating fusion. SwiGLU feed-forward. RoPE positional encoding.
142
+
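The parallel-resolution idea can be illustrated with plain pooling. This sketch only shows the sequence lengths the three stacks would see at downsample factors [1, 2, 4]; the encoder layers, cross-scale attention, and gated fusion are omitted, and mean pooling is an assumption about the downsampling operator.

```python
# Each scale processes the byte sequence at a coarser resolution; scale 1
# is the identity, scales 2 and 4 shorten the sequence by those factors.
def downsample(seq: list[float], factor: int) -> list[float]:
    """Non-overlapping mean pooling by `factor`."""
    chunks = [seq[i:i + factor] for i in range(0, len(seq), factor)]
    return [sum(c) / len(c) for c in chunks]

seq = [float(b) for b in b"def hello():"]           # 12 bytes
streams = {f: downsample(seq, f) for f in (1, 2, 4)}
print({f: len(s) for f, s in streams.items()})      # {1: 12, 2: 6, 4: 3}
```

The coarser streams give the model a cheap long-range view while the 1x stream keeps byte-level detail, which is the usual motivation for multi-scale encoders on byte inputs.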
+### Compute Infrastructure
+
+| Property | Value |
+|----------|-------|
+| Hardware | AMD Ryzen 5 5600G (6 cores, 12 threads) |
+| RAM | 16 GB |
+| GPU | None (CPU-only) |
+
+## Citation
+
+```bibtex
+@misc{axl_2026,
+  title={AXL: AXL-Chat-10M - Multi-Scale Transformer for CPU Code Generation},
+  author={Koinic},
+  year={2026},
+  url={https://huggingface.co/KoinicLabs}
+}
+```
+
+## How to Get Started
+
+### With Ollama
+
+```bash
+ollama create axl-chat-10m -f Modelfile
+ollama run axl-chat-10m "def fibonacci():"
+```
+
+### With Python
+
+```python
+import torch
+from multiscale_transformer.model.config import load_config
+from multiscale_transformer.model.model import MultiScaleTransformer
+from multiscale_transformer.training.tokenizer import ByteTokenizer
+
+config = load_config("config.json")
+model = MultiScaleTransformer(config)
+ckpt = torch.load("axl_chat_10m.pt", map_location="cpu")
+model.load_state_dict(ckpt["model_state_dict"])
+model.eval()
+
+tokenizer = ByteTokenizer()
+prompt = "def fibonacci():"
+ids = torch.tensor([tokenizer.encode(prompt)], dtype=torch.long)
+with torch.no_grad():
+    out = model.generate(ids, max_new_tokens=100, temperature=0.8, top_k=40)
+print(tokenizer.decode(out[0].tolist()))
+```
axl-chat-f16.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7b377d2e24aa0327006cc447a62caea359379b458cd675eb903756150ba4250e
+size 19876000

axl-chat-q4_k_m.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a5a0ab72de91293666a218feadcf3d58ea07cb9952a71918c45edc66d0b13768
+size 19876000

axl_chat-q4_k_m_v2.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5d1be192fd5f666337180abf388aae1bca8c967d5c12c6059b82f1d36bc2b89b
+size 7020704

axl_chat.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:407e05b8091b22cf6b6961951cb67a28594da1ec7fd225bd2cb357514f78dc68
+size 39557569
config.json ADDED
@@ -0,0 +1,32 @@
+{
+  "model_type": "multiscale_transformer",
+  "architectures": [
+    "MultiScaleForCausalLM"
+  ],
+  "vocab_size": 258,
+  "d_model": 224,
+  "n_heads": 4,
+  "d_ff": 608,
+  "n_layers_per_scale": 3,
+  "n_cross_attn_layers": 1,
+  "max_seq_len": 512,
+  "dropout": 0.0,
+  "bias": false,
+  "rope_theta": 10000.0,
+  "downsample_factors": [
+    1,
+    2,
+    4
+  ],
+  "num_parameters": 9259040,
+  "training_results": {
+    "model": "AXL-Chat-10M",
+    "params": 9870112,
+    "steps": 501,
+    "time": 180.17745065689087,
+    "final_loss": 0.005268150940537453,
+    "perplexity": 1.02,
+    "max_seq_len": 512,
+    "context_window": "512 bytes"
+  }
+}
generation_config.json ADDED
@@ -0,0 +1,8 @@
+{
+  "max_new_tokens": 256,
+  "temperature": 0.8,
+  "top_k": 40,
+  "top_p": 0.9,
+  "repetition_penalty": 1.1,
+  "do_sample": true
+}
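These defaults combine three standard sampling knobs: temperature scaling, top-k truncation, and nucleus (top-p) truncation. A generic sketch of how they interact (this is not the repo's sampler; `sample` and its signature are illustrative):

```python
import math
import random

def sample(logits: list[float], temperature=0.8, top_k=40, top_p=0.9) -> int:
    """Temperature-scaled softmax, then top-k and top-p truncation."""
    # Temperature < 1 sharpens the distribution; then normalize to probs.
    weights = [math.exp(l / temperature) for l in logits]
    z = sum(weights)
    probs = [w / z for w in weights]
    # Keep only the top_k most likely ids.
    ranked = sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]
    # Truncate further to the smallest prefix with cumulative mass >= top_p.
    kept, total = [], 0.0
    for i in ranked:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    return random.choices(kept, weights=[probs[i] for i in kept])[0]

random.seed(0)
print(sample([2.0, 1.0, 0.2, 0.1]))  # one of the surviving high-probability ids
```

With a strongly peaked distribution, top-p alone can shrink the candidate set to a single id, which is why low temperatures make output nearly deterministic even with `do_sample: true`. (`repetition_penalty` and `max_new_tokens` act outside this per-step choice.)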
index.html ADDED
@@ -0,0 +1,95 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width,initial-scale=1.0">
+<title>AXL-Chat-10M - AXL</title>
+<style>*{margin:0;padding:0;box-sizing:border-box}
+body{font-family:-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.6}
+a{color:#58a6ff;text-decoration:none}a:hover{text-decoration:underline}
+.hero{padding:40px 20px;text-align:center;border-bottom:1px solid #30363d;background:linear-gradient(135deg,#0d1117,#161b22,#0d1117)}
+.hero h1{font-size:2.2rem;color:#fff;letter-spacing:-1px}
+.cat{display:inline-block;padding:3px 12px;border-radius:12px;font-size:.75rem;font-weight:600;margin-bottom:12px}
+.cat.Lion{background:#1f3a5f;color:#4285f4}
+.cat.SGD{background:#3d1f1f;color:#f85149}
+.cat.Specialized{background:#2d1b69;color:#bb86fc}
+.desc{color:#8b949e;font-size:.95rem;max-width:600px;margin:12px auto 0}
+.ms{display:flex;flex-wrap:wrap;gap:12px;justify-content:center;padding:24px 20px}
+.mc{background:#161b22;border:1px solid #30363d;border-radius:10px;padding:16px 24px;text-align:center;min-width:120px}
+.v{font-size:1.5rem;font-weight:700;color:#fff}.l{font-size:.75rem;color:#8b949e;margin-top:2px}
+.tabs{max-width:800px;margin:0 auto;padding:0 20px}
+.tabs>input[type=radio]{display:none}
+.tl{display:inline-block;background:#21262d;border:1px solid #30363d;color:#8b949e;padding:7px 16px;border-radius:8px;cursor:pointer;font-size:.85rem;margin:0 4px 16px;transition:all .2s}
+.tl:hover{background:#30363d;color:#c9d1d9}
+.p{display:none;background:#161b22;border:1px solid #30363d;border-radius:12px;padding:24px;margin-bottom:24px}
+#t1:checked~.p1,#t2:checked~.p2,#t3:checked~.p3,#t4:checked~.p4{display:block}
+#t1:checked+label[for=t1],#t2:checked+label[for=t2],#t3:checked+label[for=t3],#t4:checked+label[for=t4]{background:#4285f4;color:#fff;border-color:#4285f4}
+table{width:100%;border-collapse:collapse}
+th{text-align:left;color:#8b949e;font-size:.8rem;padding:8px 12px;border-bottom:1px solid #21262d;font-weight:600}
+td{padding:8px 12px;font-size:.9rem;border-bottom:1px solid #21262d}
+pre{background:#0d1117;padding:14px;border-radius:8px;overflow-x:auto;margin:12px 0}
+code{color:#c9d1d9;font-size:.82rem;line-height:1.5}
+.note{background:#21262d;border-left:3px solid #4285f4;padding:12px 16px;border-radius:0 8px 8px 0;margin:12px 0;font-size:.85rem;color:#8b949e}
+.story{font-size:.9rem;color:#8b949e;line-height:1.6;margin:8px 0}
+.back{text-align:center;padding:24px 20px 40px}
+.back a{color:#58a6ff;font-size:.9rem}
+@media(max-width:768px){.hero h1{font-size:1.6rem}.ms{flex-direction:column;align-items:center}.mc{min-width:200px}}</style>
+</head>
+<body>
+<div class="hero">
+<div class="cat Lion">Lion Optimized</div>
+<h1>AXL-Chat-10M</h1>
+<p class="desc">Conversational AI. 9.9M params. PPL 1.48. Context 2048 bytes.</p>
+</div>
+<div class="ms">
+<div class="mc"><div class="v">10M</div><div class="l">Parameters</div></div>
+<div class="mc"><div class="v">1.48</div><div class="l">Perplexity</div></div>
+<div class="mc"><div class="v">10 min</div><div class="l">Training</div></div>
+<div class="mc"><div class="v">20 MB</div><div class="l">GGUF</div></div>
+</div>
+<div class="tabs">
+<input type="radio" name="t" id="t1" checked><label for="t1" class="tl">Specs</label>
+<input type="radio" name="t" id="t2"><label for="t2" class="tl">Training</label>
+<input type="radio" name="t" id="t3"><label for="t3" class="tl">Usage</label>
+<input type="radio" name="t" id="t4"><label for="t4" class="tl">Download</label>
+<div class="p p1">
+<table>
+<tr><th>Property</th><th>Value</th></tr>
+<tr><td>Architecture</td><td>Multi-Scale Transformer</td></tr>
+<tr><td>d_model</td><td>224</td></tr>
+<tr><td>Attention Heads</td><td>4</td></tr>
+<tr><td>Layers per Scale</td><td>3</td></tr>
+<tr><td>Context Window</td><td>2048 bytes</td></tr>
+<tr><td>Downsample Factors</td><td>[1, 2, 4]</td></tr>
+<tr><td>Vocab Size</td><td>258 (byte-level)</td></tr>
+<tr><td>Optimizer</td><td>Lion</td></tr>
+</table>
+</div>
+<div class="p p2">
+<div class="story">Retrained with Lion on 10MB chat pairs. 216 steps in 10 min. Covers code Q&A, general knowledge.</div>
+<table>
+<tr><th>Metric</th><th>Value</th></tr>
+<tr><td>Final Loss</td><td>0.3650</td></tr>
+<tr><td>Perplexity</td><td>1.48</td></tr>
+<tr><td>Training Steps</td><td>216</td></tr>
+<tr><td>Training Time</td><td>10 min</td></tr>
+</table>
+</div>
+<div class="p p3">
+<h3 style="color:#fff;margin-bottom:12px">Usage</h3>
+<pre><code>ollama create axl-chat-10m -f Modelfile
+ollama run axl-chat-10m "def fibonacci():"</code></pre>
+<div class="note">Good for code explanation and Q&A.</div>
+</div>
+<div class="p p4">
+<table>
+<tr><th>File</th><th>Size</th><th>Format</th></tr>
+<tr><td>F16 GGUF</td><td>20 MB</td><td>Full precision</td></tr>
+<tr><td>Q4_K_M GGUF</td><td>7 MB</td><td>4-bit quantized</td></tr>
+</table>
+<div class="note" style="margin-top:16px">GGUF files work with Ollama and llama.cpp. Q4_K_M is about 3x smaller than F16.</div>
+</div>
+</div>
+<div class="back"><a href="../">← All AXL Models</a></div>
+</body>
+</html>
results.json ADDED
@@ -0,0 +1,10 @@
+{
+  "model": "AXL-Chat-10M",
+  "params": 9870112,
+  "steps": 216,
+  "time": 601.3559806346893,
+  "final_loss": 0.3649772107601166,
+  "perplexity": 1.48,
+  "optimizer": "Lion",
+  "max_seq_len": 2048
+}
special_tokens_map.json ADDED
@@ -0,0 +1,6 @@
+{
+  "pad_token": "[PAD]",
+  "bos_token": "[BOS]",
+  "eos_token": "[EOS]",
+  "unk_token": "[UNK]"
+}

tokenizer_config.json ADDED
@@ -0,0 +1,8 @@
+{
+  "tokenizer_class": "ByteTokenizer",
+  "vocab_size": 258,
+  "pad_token": "[PAD]",
+  "bos_token": "[BOS]",
+  "eos_token": "[EOS]",
+  "unk_token": "[UNK]"
+}