AxionLab-Co/AxionMoE-350k-A250k

Text Generation
Transformers
Safetensors
English
deepseek_nano
math
experiment
Mixture of Experts
deepseek
from-scratch
tiny-model
cpu
deepseek-v3-architecture
custom_code
AxionMoE-350k-A250k
1.45 MB
  • 1 contributor
History: 21 commits
Latest commit: Update README.md, by AxionLab-official (3f7e5c2, verified, 7 days ago)
  • .gitattributes · 1.52 kB · initial commit · 7 days ago
  • README.md · 5.33 kB · Update README.md · 7 days ago
  • config.json · 1.02 kB · Update config.json · 7 days ago
  • model.model · 8.33 kB · xet · Upload 4 files · 7 days ago
  • model.safetensors · 1.39 MB · xet · Upload 4 files · 7 days ago
  • model.vocab · 40.5 kB · Upload 4 files · 7 days ago
  • modeling_axion.py · 4.88 kB · Update modeling_axion.py · 7 days ago