Text Generation · Portuguese

AxionLab-official committed b8a046a · verified · 1 Parent(s): d13323d

Update README.md

Files changed (1): README.md (+196 -3)

README.md CHANGED:

---
license: apache-2.0
language:
- pt
---

**🧠 MiniAxion1.5-3M**

**Emergent reasoning in a 2.7M-parameter model.
A tiny Portuguese-first language model that learns how to think before it learns how to be correct.**

**🚀 Overview**

MiniAxion1.5-3M is an ultra-compact (~2.7M parameters) GPT-style language model designed to investigate how reasoning emerges at extremely small scale.

Unlike typical small models optimized for fluency, MiniAxion is explicitly trained to produce:

- Structured reasoning traces
- Step-by-step thinking (`<THINK>`/`<STEP>` tags)
- Deterministic answer formatting

It operates primarily in Portuguese, making it a rare example of a non-English, reasoning-first nano model.
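
Below is a minimal inference sketch using the Hugging Face `transformers` library. The repo id `AxionLab-official/MiniAxion1.5-3M` and the decoding settings are assumptions inferred from this card, not confirmed values.

```python
# Hypothetical usage sketch; the repo id and decoding settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "AxionLab-official/MiniAxion1.5-3M"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)  # ~2.7M params: comfortable on CPU

prompt = "Quanto é 23 + 41?"  # "What is 23 + 41?" -- the model reasons in Portuguese
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=96, do_sample=False)

# Expect a structured trace such as <THINK><STEP>...</THINK><ANSWER>...</ANSWER>
print(tokenizer.decode(output[0], skip_special_tokens=False))
```

Greedy decoding (`do_sample=False`) matches the card's emphasis on deterministic output formats.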

**⚡ Why This Model Is Interesting**

Most models follow this trajectory:

Language → Knowledge → Reasoning

MiniAxion flips part of that:

Structure → Reasoning format → (still learning correctness)

**💡 Key insight:**

The model demonstrates that reasoning structure can emerge independently of reasoning accuracy.

**🧪 Evaluation**

Task performance:

| Task | Accuracy |
|---|---|
| Addition | 10% |
| Subtraction | 10% |
| Multiplication | 0% |
| Even/Odd | 100% |
| Comparison | 5% |
| Sequence Completion | 0% |
| Word Problems (Addition) | 10% |
| Word Problems (Subtraction) | 0% |
| Word Problems (Multiplication) | 10% |
| True/False | 100% |
| Chat/Greetings | 100% |
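
A sketch of how per-task accuracy like this could be scored is below. The `<ANSWER>` extraction follows the output format shown later in this card; the harness itself (function names, example data) is illustrative rather than the authors' actual evaluation script.

```python
import re

ANSWER_RE = re.compile(r"<ANSWER>\s*(.*?)\s*</ANSWER>", re.DOTALL)

def extract_answer(generation):
    """Pull the final answer out of a structured generation, if one exists."""
    match = ANSWER_RE.search(generation)
    return match.group(1).strip() if match else None

def task_accuracy(examples, generate):
    """examples: list of (prompt, gold_answer) pairs; generate: prompt -> raw model text."""
    hits = sum(extract_answer(generate(prompt)) == gold for prompt, gold in examples)
    return hits / len(examples)

# Illustrative check with a stub generator that always answers "74":
assert task_accuracy([("Quanto é 23 + 41?", "64")], lambda p: "<ANSWER> 74 </ANSWER>") == 0.0
```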

**🧠 Reasoning Behavior Metrics**

| Metric | Score |
|---|---|
| Thinking Rate | 100% |
| Step Format | 100% |
| Answer Completion | 100% |

✔ The model always thinks
✔ The model always structures its reasoning
✔ The model always produces an answer
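
These three rates can be computed mechanically from raw generations. A possible implementation, assuming the `<THINK>`/`<STEP>`/`<ANSWER>` tags this card describes:

```python
import re

def structure_metrics(generations):
    """Fraction of outputs that (a) open and close a <THINK> block,
    (b) contain at least one <STEP>, and (c) emit a closed <ANSWER> tag."""
    n = len(generations)
    return {
        "thinking_rate": sum("<THINK>" in g and "</THINK>" in g for g in generations) / n,
        "step_format": sum("<STEP>" in g for g in generations) / n,
        "answer_completion": sum(
            bool(re.search(r"<ANSWER>.*?</ANSWER>", g, re.DOTALL)) for g in generations
        ) / n,
    }
```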

**📊 Interpretation**

MiniAxion exhibits a clear dissociation:

✅ What it learned
- Reasoning format
- Step-by-step decomposition
- Logical task patterns (parity, boolean)

❌ What it did NOT learn
- Arithmetic correctness
- Numerical reasoning
- Multi-step computation

**🔬 Core Finding**

Reasoning ≠ Correctness

MiniAxion shows that models can internalize thinking patterns without actually learning how to solve problems.

This makes it a strong candidate for studying:

- Emergent reasoning
- Tiny Recursive Models (TRMs)
- Reasoning distillation

**🏗️ Architecture**

- Type: GPT-style Transformer
- Parameters: ~2.7M
- Objective: Next-token prediction
- Language: Portuguese (primary)
- Specialization: Structured reasoning traces
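
For intuition about the ~2.7M figure, here is a back-of-the-envelope parameter count for a GPT-style decoder stack. The card does not publish the model's actual dimensions, so the configuration below is a hypothetical one that merely lands near the stated size.

```python
def gpt_param_count(vocab_size, d_model, n_layers, n_positions=256):
    """Rough parameter count for a GPT-style decoder with a tied LM head."""
    d_ff = 4 * d_model
    embeddings = vocab_size * d_model + n_positions * d_model  # token + position tables
    attention = 4 * d_model * d_model + 4 * d_model            # Q/K/V/out projections (+ biases)
    mlp = 2 * d_model * d_ff + d_ff + d_model                  # up/down projections (+ biases)
    layernorms = 2 * 2 * d_model                               # two LayerNorms per block
    block = attention + mlp + layernorms
    return embeddings + n_layers * block + 2 * d_model         # + final LayerNorm

# One hypothetical shape that lands near the stated size (~2.76M parameters):
print(gpt_param_count(vocab_size=12_000, d_model=128, n_layers=6))
```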

**🧠 Training Strategy**

The model was trained with a reasoning-first approach:

- Portuguese language grounding
- Structured reasoning data using `<THINK>`/`<STEP>` tags (see the serialization sketch below)
- Emphasis on:
  - Deterministic formats
  - Multi-step thinking
  - Explicit reasoning tokens

🚫 No RLHF
🚫 No instruction tuning at scale
🚫 No large-model distillation (yet)
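
The sketch referenced above shows how one such reasoning-first training example might be serialized. The exact template is an assumption; the card only specifies the `<THINK>`, `<STEP>`, and `<ANSWER>` tags.

```python
def format_example(question, steps, answer):
    """Serialize a supervised example with explicit reasoning tokens."""
    body = "\n".join(f"<STEP> {step}" for step in steps)
    return f"{question}\n<THINK>\n{body}\n</THINK>\n<ANSWER> {answer} </ANSWER>"

print(format_example(
    "Quanto é 23 + 41?",  # "What is 23 + 41?"
    ["Identifico os números 23 e 41", "Somo os valores"],  # identify, then add
    "64",
))
```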

**⚠️ Limitations**

1. Arithmetic Collapse

Near-random performance on addition, subtraction, and multiplication → indicates a lack of numerical representation learning.

2. Template Dependence

Strong dependence on:

- Prompt format
- Token patterns
- Seen reasoning templates

**🔮 Future Work**

This model is just the beginning.

📈 Scaling
- 5M / 10M / 20M versions
- Track the emergence of correctness

🧪 Distillation
- Inject reasoning from larger models
- Improve accuracy without scaling parameters

🔁 Self-Play / Synthetic Data
- Generate reasoning loops
- Reinforce correct chains

🧩 Hybrid Reasoning
- Combine symbolic and neural learning
- Fix the arithmetic weakness

**🧾 Example Output**

```
<THINK>
<STEP> Identifico os números
<STEP> Tento somar os valores
<STEP> Ajusto o resultado
</THINK>
<ANSWER> 74 </ANSWER>
```

(The steps read, in English: "I identify the numbers", "I try to add the values", "I adjust the result.")

✔ Perfect reasoning structure
❌ Incorrect answer

**💡 Takeaway**

MiniAxion1.5-3M demonstrates something important:

Even a 2.7M-parameter model can learn to simulate thinking before it learns to actually think correctly.

**🤝 Use Cases**

- Research on emergent reasoning
- Tiny-model experimentation (CPU-friendly)
- Educational demos of:
  - Chain-of-Thought
  - Reasoning failure modes
- Base model for:
  - Distillation
  - NRM experiments