anthonym21/Eve-2-MoE-IT-272M

Text Generation

Mixture of Experts

instruction-tuning

Total size: 2.18 GB

1 contributor

History: 30 commits

perf: use torch.compile max-autotune mode

9289ac8 verified about 1 month ago

.gitattributes

1.57 kB
Add Eve-2 swarm logo about 2 months ago
README.md

7.93 kB
Restore full README with training history, swarm table, and specialist status about 1 month ago
config.json

636 Bytes
Add config.json from base model about 1 month ago
configuration_eve.py

2.78 kB
Add configuration_eve.py from base model about 1 month ago
eve-2-swarm.jpg

160 kB
xet

Add Eve-2 swarm logo about 2 months ago
generate.py

3.48 kB
Add generate.py from base model about 1 month ago
generation_config.json

164 Bytes
Add generation_config.json from base model about 1 month ago
model.safetensors

1.09 GB
xet

Add instruction-tuned model in safetensors format about 1 month ago
modeling_eve.py

18.1 kB
perf: remove CPU-GPU sync bottleneck in SharedMoE routing loop about 1 month ago
push_to_hub.py

435 Bytes
Upload folder using huggingface_hub about 2 months ago
pytorch_model.bin
Detected Pickle imports (3)
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict",
- "torch.FloatStorage"
1.09 GB
xet

Add instruction-tuned weights (3 epochs on alpaca-cleaned) about 1 month ago
tokenizer.json

3.56 MB
Eve-2-MoE-IT-272M: heavy IT patch (open-perfectblend, LoRA r=128, merged) about 2 months ago
tokenizer_config.json

297 Bytes
Upload folder using huggingface_hub about 2 months ago
train.py

17.2 kB
perf: use torch.compile max-autotune mode about 1 month ago
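
The pickle imports reported for pytorch_model.bin above are the globals a pickle stream would ask Python to import when loaded. As a rough illustration of how such a list can be derived without executing the file, the sketch below walks the opcodes with the standard-library pickletools module. It is not the scanner the hosting site uses (that is not stated in this listing); the function name list_pickle_imports and the sample payload are hypothetical, and the sketch only handles the GLOBAL opcode used by pickle protocols 0 to 3.

```python
import pickle
import pickletools
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> set:
    """Return the 'module.name' globals referenced by a pickle stream.

    Assumption: only the GLOBAL opcode (protocols 0-3) is handled.
    Protocol 4+ uses STACK_GLOBAL, which this sketch deliberately ignores.
    """
    imports = set()
    # genops decodes opcodes without executing them, so nothing is imported.
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
    return imports


# Hypothetical stand-in for a checkpoint: a protocol-2 pickle of an OrderedDict.
sample = pickle.dumps(OrderedDict(a=1), protocol=2, fix_imports=False)
found = list_pickle_imports(sample)
print(sorted(found))
```

A real PyTorch checkpoint is typically a zip archive with the pickle stored inside it, so the inner pickle member would need to be read out first before passing its bytes to a function like this. The model.safetensors file in the same listing avoids the question, since that format stores tensors without pickle.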