wayyresearch/aetheris
Wayy Research Co.
Tags: Text Generation, PyTorch, 65 languages, mamba, ssm, state-space-model, mixture-of-experts, multilingual, distillation, knowledge-distillation, aya, hybrid-architecture, wayy-research
arXiv: 2312.00752
License: apache-2.0
Revision: 56a347e
Total size: 5.97 GB
1 contributor, History: 35 commits
Latest commit: rcgalbo, "Upload Stage 2 final checkpoint (step 20000)", 56a347e (verified), about 1 month ago
| File | Scan status | Size | Storage | Last commit message |
|---|---|---|---|---|
| .gitattributes | Safe | 1.52 kB | | initial commit |
| README.md | Safe | 1.89 kB | | Stage 2 checkpoint: [Step 18500/20000] loss=3.1250 |
| stage1_checkpoint.pt | Suspicious (pickle) | 1.64 GB | xet | Stage 1 checkpoint: [Step 50/20000] loss=7.7500 |
| stage1_metadata.json | Safe | 414 Bytes | | Stage 1 checkpoint: [Step 50/20000] loss=7.7500 |
| stage2_best.pt | pickle (no status shown) | 1.44 GB | xet | Upload final Stage 2 best checkpoint (loss=2.7305, 20K steps) |
| stage2_checkpoint.pt | Suspicious (pickle) | 1.44 GB | xet | Stage 2 checkpoint: [Step 18500/20000] loss=3.1250 |
| stage2_final.pt | Safe (pickle) | 1.44 GB | xet | Upload Stage 2 final checkpoint (step 20000) |
| stage2_metadata.json | Safe | 298 Bytes | | Stage 2 checkpoint: [Step 18500/20000] loss=3.1250 |
| student_config.yaml | Safe | 668 Bytes | | Stage 1 initial: step 1000, loss=0.29, cka=0.60 |
| training_config.yaml | Safe | 2.74 kB | | Stage 1 initial: step 1000, loss=0.29, cka=0.60 |

All files were last updated about 1 month ago.

Detected pickle imports:
stage1_checkpoint.pt (4): "torch.BFloat16Storage", "collections.OrderedDict", "torch._utils._rebuild_tensor_v2", "torch.FloatStorage"
stage2_best.pt (3): "collections.OrderedDict", "torch.BFloat16Storage", "torch._utils._rebuild_tensor_v2"
stage2_checkpoint.pt (3): "collections.OrderedDict", "torch._utils._rebuild_tensor_v2", "torch.BFloat16Storage"
stage2_final.pt (3): "collections.OrderedDict", "torch._utils._rebuild_tensor_v2", "torch.BFloat16Storage"
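The pickle import lists shown for the .pt files can be approximated locally without unpickling anything. The following is a minimal sketch, not the Hub's actual scanner: it walks a pickle stream with the standard-library `pickletools` module and collects the globals the stream would import. The function name `list_pickle_imports` is hypothetical, and resolving `STACK_GLOBAL` from the two most recently seen strings is a heuristic that can miss names fetched from the memo.

```python
import collections
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set[str]:
    """Return 'module.name' globals referenced by a pickle stream, without loading it."""
    imports = set()
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: pickletools yields the argument as "module name".
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are pushed as strings beforehand.
            # Heuristic: assume they are the two most recent string arguments.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


if __name__ == "__main__":
    # Illustrative input only: a small pickle built in memory, not a checkpoint.
    payload = pickle.dumps(collections.OrderedDict(), protocol=4)
    print(sorted(list_pickle_imports(payload)))
```

Applying this to one of the checkpoints would first require extracting the pickle stream from the file, which is not shown here. When loading such checkpoints with PyTorch, passing `weights_only=True` to `torch.load` is one way to restrict what the unpickler may import, assuming the checkpoint contains only tensors and plain containers.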