DevHunterAI/HSSM-v2-250M
Tags: Text Generation, PyTorch, HuggingFaceFW/fineweb-edu, English, hssm-v2, hierarchical-state-space-model, mixture-of-experts, autoregressive, fineweb-edu, 250m-parameters, 0.25B
Branch: main, 1 GB
1 contributor, History: 8 commits
Latest commit: DevHunterAI, "Update README.md" (e3e74c0, verified), 7 days ago
.gitattributes (1.58 kB): "Upload HSSM_v2_architecture.png with huggingface_hub", 7 days ago
HSSM_v2_architecture.png (728 kB, xet): "Upload HSSM_v2_architecture.png with huggingface_hub", 7 days ago
README.md (6.04 kB): "Update README.md", 7 days ago
hssm_pretrained_chat.py (27.8 kB): "Upload hssm_pretrained_chat.py with huggingface_hub", 7 days ago
hssm_v2_250m_fineweb_edu_final.pt (1 GB, xet, pickle): "Upload hssm_v2_250m_fineweb_edu_final.pt with huggingface_hub", 7 days ago
    Detected Pickle imports (3): "torch.FloatStorage", "collections.OrderedDict", "torch._utils._rebuild_tensor_v2"
hssm_v2_gpu_pretrain.py (17 kB): "Upload hssm_v2_gpu_pretrain.py with huggingface_hub", 7 days ago
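
The "Detected Pickle imports" entry for the `.pt` checkpoint lists the module-level names that the pickle stream would import when loaded. The following is a minimal sketch of how such a list could be derived with the standard library's `pickletools`, without executing the pickle. It is not the Hub's actual scanner: the function name `list_pickle_imports` is hypothetical, and the stack tracking is a heuristic that assumes a stream produced by a normal pickler, not a full emulation of the unpickler.

```python
import collections
import pickle
import pickletools

# Opcode groups, named as in pickletools.
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}
GET_OPS = {"BINGET", "LONG_BINGET", "GET"}
PUT_OPS = {"BINPUT", "LONG_BINPUT", "PUT"}
NEUTRAL_OPS = {"PROTO", "FRAME"}  # these leave the stack untouched


def list_pickle_imports(data: bytes) -> set[str]:
    """Return 'module.name' for every global referenced in a pickle stream.

    Hypothetical helper: it only reads opcodes and never unpickles anything.
    """
    imports = set()
    memo = {}     # memo index -> string value, or None for non-strings
    recent = []   # the two most recent pushes (str, or None for anything else)
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name in NEUTRAL_OPS:
            continue
        if name == "GLOBAL":
            # Protocols 0-2: the argument is "module name", space-separated.
            imports.add(arg.replace(" ", ".", 1))
            recent.append(None)
        elif name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two strings on the stack.
            if len(recent) >= 2 and all(isinstance(s, str) for s in recent):
                imports.add(f"{recent[-2]}.{recent[-1]}")
            recent.append(None)
        elif name in STRING_OPS:
            recent.append(arg)
        elif name == "MEMOIZE":
            memo[len(memo)] = recent[-1] if recent else None
        elif name in PUT_OPS:
            memo[arg] = recent[-1] if recent else None
        elif name in GET_OPS:
            recent.append(memo.get(arg))
        else:
            # Any other opcode: treat the top of the stack as a non-string.
            recent.append(None)
        del recent[:-2]  # only the top two entries matter here
    return imports


# Example: an OrderedDict pickles as a reference to collections.OrderedDict.
blob = pickle.dumps(collections.OrderedDict(a=1), protocol=4)
print(sorted(list_pickle_imports(blob)))
```

A listing like this is useful for review because any name outside an expected allowlist (here, tensor storage and `OrderedDict` helpers) signals that loading the file could run unexpected code.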