yashmarathe/avey-d-moe-1b
Tags: Text Generation, Transformers, Safetensors, English, avey-d-moe, causal-lm, mixture-of-experts, attention-free, avey
License: apache-2.0
Branch: main (avey-d-moe-1b, 4.04 GB)
1 contributor
History: 4 commits
Latest commit: 2de1427 (verified) by yashmarathe, about 2 months ago: "fix: Update modeling code with post_init() for transformers 5.2.0"
| File | Size | Notes | Last commit message | Last updated |
| --- | --- | --- | --- | --- |
| .gitattributes | 1.52 kB | Safe | initial commit | about 2 months ago |
| README.md | 2.84 kB | | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |
| config.json | 585 Bytes | | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |
| configuration.py | 9.56 kB | | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |
| model.safetensors | 4.04 GB | xet | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |
| modeling_dense.py | 15.6 kB | | fix: Update modeling code with post_init() for transformers 5.2.0 | about 2 months ago |
| modeling_moe.py | 9.69 kB | | fix: Update modeling code with post_init() for transformers 5.2.0 | about 2 months ago |
| moe.py | 20.2 kB | | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |
| tokenizer.json | 6.31 MB | | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |
| tokenizer_config.json | 354 Bytes | | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago |