AxionLab-Co/AxionMoE-350k-A250k
AxionLaboratory Researches Co.
Text Generation
Transformers
Safetensors
openai/gsm8k
English
deepseek_nano
math
experiment
Mixture of Experts
deepseek
from-scratch
tiny-model
cpu
deepseek-v3-architecture
custom_code
arxiv: 2412.19437
License: mit
main
AxionMoE-350k-A250k
1.45 MB
1 contributor
History: 21 commits
Latest commit: Update README.md (3f7e5c2, verified) by AxionLab-official, 7 days ago
File               Size     Flags  Last commit               Updated
.gitattributes     1.52 kB  Safe   initial commit            7 days ago
README.md          5.33 kB         Update README.md          7 days ago
config.json        1.02 kB         Update config.json        7 days ago
model.model        8.33 kB  xet    Upload 4 files            7 days ago
model.safetensors  1.39 MB  xet    Upload 4 files            7 days ago
model.vocab        40.5 kB         Upload 4 files            7 days ago
modeling_axion.py  4.88 kB         Update modeling_axion.py  7 days ago