---
license: apache-2.0
language:
- pt
library_name: transformers
---

**Axion1.5-0.3B-Base**

🧠 Axion1.5-0.3B-Base is a base language model with approximately 300 million parameters, trained purely for next-token prediction.

No instruction tuning.
No reinforcement learning.
No forced reasoning chains.

Just raw language modeling.

This model exists with a clear goal: to act as a clean, transparent baseline for future experiments focused on explicit reasoning and structured thinking.
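To make "pure next-token prediction" concrete, the sketch below shows what interacting with a raw base model looks like: no chat template, no stop logic, just greedy continuation of token ids. Since the card does not state the repository id, a tiny randomly initialized GPT-2 stands in for the actual checkpoint; the token ids are placeholders, not real text.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Toy stand-in for the real checkpoint (repo id not stated in this card);
# with the published weights you would instead load it via
# AutoModelForCausalLM.from_pretrained(<repo id>).
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=256)
model = GPT2LMHeadModel(config)
model.eval()

# A base model only continues text: feed token ids, read next-token logits.
input_ids = torch.tensor([[10, 20, 30]])  # placeholder token ids

with torch.no_grad():
    for _ in range(5):  # greedy continuation, one token at a time
        logits = model(input_ids).logits      # shape: (batch, seq, vocab)
        next_id = logits[0, -1].argmax()      # most likely next token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(input_ids.shape)  # prompt length 3 + 5 generated tokens
```

With the real checkpoint, the same loop is what `model.generate(..., do_sample=False)` performs internally; there is no instruction-following layer on top.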

**What this model is**

- A foundation / base model
- Trained only with next-token prediction
- Not optimized for chat or instruction-following
- Designed as a reference point for research and comparison
- A functional “blank mind” before reasoning specialization

**What this model is not**

- ❌ Not a chatbot
- ❌ Not instruction-tuned
- ❌ Not aligned for safety or helpfulness
- ❌ Not optimized for long conversations
- ❌ Not a reasoning model (yet)

If you are looking for a model that follows instructions or explains its thoughts, this is not it.

**Why release a base model?**

Releasing the base model publicly allows:

- Transparent evaluation of raw language modeling quality
- Fair comparison with future Axion reasoning variants
- Reproducibility and honest benchmarking
- A clear separation between language competence and reasoning behavior

Many projects hide their base models. Axion does the opposite.

**Intended use**

- Research and experimentation
- Fine-tuning for instruction-following or reasoning tasks
- Studying the effects of reasoning-oriented datasets
- Serving as a backbone for Axion1.5-Reasoning variants
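For the fine-tuning use case, a single causal-LM training step is sketched below. The model's built-in loss shifts the labels by one position, so labels are simply a copy of the input ids. A tiny randomly initialized GPT-2 again stands in for the actual checkpoint (repo id not given in this card), and the batch is random token ids rather than real data.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Toy stand-in for the checkpoint; with the real weights you would load
# AutoModelForCausalLM.from_pretrained(<repo id>) instead.
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=256))
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One causal-LM fine-tuning step: passing labels=input_ids makes the model
# compute cross-entropy over next-token predictions internally.
batch = torch.randint(0, 256, (2, 16))      # 2 placeholder sequences of 16 ids
out = model(input_ids=batch, labels=batch)  # out.loss is the LM loss
out.loss.backward()
optimizer.step()
optimizer.zero_grad()

print(float(out.loss))  # finite scalar next-token loss
```

The same step, run over an instruction or reasoning dataset, is all that separates this base model from the planned tuned variants.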

**Limitations**

Because this model is trained only for next-token prediction:

- It may produce incoherent or incomplete responses
- It does not reliably follow instructions
- It does not reason step-by-step
- It may hallucinate or contradict itself

These limitations are expected and acknowledged.

**Future work**

This release is part of a broader project:

- Axion1.5-Reasoning – fine-tuned for structured reasoning
- Axion-Critic – models focused on evaluation and self-critique
- Experiments with short, verifiable reasoning traces

The base model will remain unchanged to preserve its value as a reference.

**Philosophy**

Scale is not intelligence. Structure matters.

Axion explores whether smaller models, trained with the right constraints, can develop more meaningful reasoning behaviors.

This is an experiment. And experiments are allowed to fail.

**Acknowledgements**

Created as an independent research project focused on understanding how reasoning emerges in language models.