yashmarathe/avey-d-moe-1b

Text Generation

mixture-of-experts

4.04 GB

1 contributor

History: 4 commits

fix: Update modeling code with post_init() for transformers 5.2.0

2de1427 verified about 2 months ago

File | Size | Last commit | Updated
.gitattributes | 1.52 kB | initial commit | about 2 months ago
README.md | 2.84 kB | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago
config.json | 585 Bytes | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago
configuration.py | 9.56 kB | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago
model.safetensors | 4.04 GB (xet) | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago
modeling_dense.py | 15.6 kB | fix: Update modeling code with post_init() for transformers 5.2.0 | about 2 months ago
modeling_moe.py | 9.69 kB | fix: Update modeling code with post_init() for transformers 5.2.0 | about 2 months ago
moe.py | 20.2 kB | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago
tokenizer.json | 6.31 MB | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago
tokenizer_config.json | 354 Bytes | Upload Avey-D MoE 1B (1.01B total, 205M active, trained on FineWeb 1.3B tokens) | about 2 months ago