wayyresearch/aetheris
Wayy Research Co.
Tags: Text Generation, 14 languages, multilingual, mamba, Mixture of Experts, distillation, aya
License: apache-2.0
Branch: main
Repository: aetheris (8.86 GB, 1 contributor, History: 38 commits)
Latest commit: 3bfe5e4 (verified) by rcgalbo, about 2 hours ago
    Upload Aetheris model with source code (Stage 2, 722M params, loss=2.73)
aetheris/ (directory)
    Last commit: Upload Aetheris model (Stage 2 best, 722M params, loss=2.73)
    Updated: about 2 hours ago

.gitattributes (1.52 kB, Safe)
    Last commit: initial commit
    Updated: about 22 hours ago

README.md (1.41 kB)
    Last commit: Upload Aetheris model (Stage 2 best, 722M params, loss=2.73)
    Updated: about 2 hours ago

config.yaml (316 Bytes)
    Last commit: Upload Aetheris model (Stage 2 best, 722M params, loss=2.73)
    Updated: about 2 hours ago

pytorch_model.pt (2.89 GB, pickle, xet)
    Detected Pickle imports (3): collections.OrderedDict, torch._utils._rebuild_tensor_v2, torch.FloatStorage
    Last commit: Upload Aetheris model (Stage 2 best, 722M params, loss=2.73)
    Updated: about 2 hours ago

stage1_checkpoint.pt (1.64 GB, pickle, xet)
    Detected Pickle imports (4): torch.BFloat16Storage, collections.OrderedDict, torch._utils._rebuild_tensor_v2, torch.FloatStorage
    Last commit: Stage 1 checkpoint: [Step 50/20000] loss=7.7500
    Updated: about 20 hours ago

stage1_metadata.json (414 Bytes)
    Last commit: Stage 1 checkpoint: [Step 50/20000] loss=7.7500
    Updated: about 20 hours ago

stage2_best.pt (1.44 GB, pickle, xet)
    Detected Pickle imports (3): collections.OrderedDict, torch.BFloat16Storage, torch._utils._rebuild_tensor_v2
    Last commit: Upload final Stage 2 best checkpoint (loss=2.7305, 20K steps)
    Updated: about 3 hours ago

stage2_checkpoint.pt (1.44 GB, pickle, xet)
    Detected Pickle imports (3): collections.OrderedDict, torch._utils._rebuild_tensor_v2, torch.BFloat16Storage
    Last commit: Stage 2 checkpoint: [Step 18500/20000] loss=3.1250
    Updated: about 17 hours ago

stage2_final.pt (1.44 GB, pickle, xet)
    Detected Pickle imports (3): collections.OrderedDict, torch._utils._rebuild_tensor_v2, torch.BFloat16Storage
    Last commit: Upload Stage 2 final checkpoint (step 20000)
    Updated: about 3 hours ago

stage2_metadata.json (263 Bytes)
    Last commit: Update Stage 2 metadata: COMPLETE, best loss=2.7305
    Updated: about 3 hours ago

student_config.yaml (668 Bytes)
    Last commit: Stage 1 initial: step 1000, loss=0.29, cka=0.60
    Updated: about 22 hours ago

training_config.yaml (2.74 kB)
    Last commit: Stage 1 initial: step 1000, loss=0.29, cka=0.60
    Updated: about 22 hours ago
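
The "Detected Pickle imports" entries in the listing above name the globals that each .pt file's pickle stream references. As a rough illustration of how such a list can be obtained without executing the pickle, the sketch below walks the opcodes with Python's standard pickletools module. It is not the Hub's actual scanner: the helper name list_pickle_imports is hypothetical, the demo payload is a small in-memory OrderedDict rather than one of the checkpoints, and a real .pt file saved by recent PyTorch versions is typically a zip archive whose inner pickle would have to be extracted first (an assumption about these files, not something the listing states).

```python
import collections
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set[str]:
    """Return the 'module.name' globals a pickle stream references, without unpickling it."""
    imports = set()
    recent_strings = []  # string operands seen so far, consulted by STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: pickletools reports the operand as "module name".
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two most recently pushed strings.
            # Best effort only: strings re-fetched from the memo are not tracked.
            if len(recent_strings) >= 2:
                imports.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


# Demo payload (hypothetical stand-in for a checkpoint's inner pickle).
payload = pickle.dumps(collections.OrderedDict(a=1), protocol=2)
print(sorted(list_pickle_imports(payload)))
```

Because genops only decodes opcodes and never constructs objects, this kind of inspection is safe to run on an untrusted file. A short import list limited to OrderedDict and torch tensor-rebuilding helpers, as reported for the files here, is what a plain state dict usually produces.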