ss_d4096_f0.0039
Weight-sparse transformer trained with the procedure from Gao et al. (2025).
Model Details
- Layers: 4
- Model Dimension: 4096
- Context Length: 512
- Head Dimension: 16
- Vocabulary Size: 4096
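A couple of derived sizes follow from the numbers above. This is a rough sketch, not a statement about the actual architecture: it assumes attention heads tile the model dimension exactly and that there is a single embedding matrix of shape (vocabulary size, model dimension). Neither detail is stated in this card.

```python
# Values copied from the Model Details list.
n_layers = 4
d_model = 4096
context_length = 512
head_dim = 16
vocab_size = 4096

# Assumption: heads tile the model dimension exactly (standard multi-head attention).
n_heads = d_model // head_dim

# Assumption: one embedding matrix of shape (vocab_size, d_model).
embedding_entries = vocab_size * d_model

print(n_heads)            # 256
print(embedding_entries)  # 16777216
```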
Sparsity
- Weight Sparsity: True
- Target L0 Fraction: 0.0039
- Activation Sparsity: True
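To get a feel for what a target L0 fraction of 0.0039 means, the sketch below counts how many entries of a single square weight matrix could be nonzero. It assumes the fraction is the share of nonzero entries per weight matrix and uses a hypothetical 4096 x 4096 matrix for illustration; the card does not say how the budget is distributed across matrices or rows. The value 0.0039 is close to 1/256 (0.00390625), but the card does not state whether that is the exact target.

```python
d_model = 4096
target_l0_fraction = 0.0039

# Assumption: the target L0 fraction is the share of entries in a weight
# matrix that are allowed to be nonzero.
total_entries = d_model * d_model
nonzero_entries = round(total_entries * target_l0_fraction)

# Average nonzero entries per row if the budget were spread evenly (an assumption).
avg_nonzero_per_row = d_model * target_l0_fraction

print(total_entries)        # 16777216
print(nonzero_entries)      # 65431
print(avg_nonzero_per_row)  # about 15.97
```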
Training
- Dataset: SimpleStories/SimpleStories
- Tokenizer: SimpleStories/SimpleStories-1.25M
- Total Tokens: 2,000,000,000
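As a back-of-the-envelope check on the training budget, the token count can be converted into a number of training sequences. This assumes every sequence fills the full 512-token context, which the card does not confirm.

```python
total_tokens = 2_000_000_000
context_length = 512

# Assumption: every training sequence is exactly context_length tokens long.
n_sequences = total_tokens // context_length

print(n_sequences)  # 3906250
```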
Training Run
Usage
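import json
import torch
from huggingface_hub import hf_hub_download
# Download the weights and config from the Hub
model_path = hf_hub_download(repo_id="jacobcd52/ss_d4096_f0.0039", filename="pytorch_model.bin")
config_path = hf_hub_download(repo_id="jacobcd52/ss_d4096_f0.0039", filename="config.json")
# Read the config, then load the weights onto the CPU
# (building the model requires the SparseGPT model class from this repo)
with open(config_path) as f:
    config = json.load(f)
state_dict = torch.load(model_path, map_location="cpu")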
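Once `state_dict` is loaded, one way to sanity-check the sparsity is to measure the fraction of nonzero entries in each tensor. The helper below is a minimal sketch and not part of this repo: the name `nonzero_fraction_by_tensor` is hypothetical, and it assumes pruned weights are stored as exact zeros in the checkpoint, which this card does not confirm. It relies only on the `count_nonzero()` and `numel()` methods of `torch.Tensor`.

```python
def nonzero_fraction_by_tensor(state_dict):
    """Map each tensor name to the fraction of its entries that are nonzero.

    Assumption: pruned weights are stored as exact zeros, so counting
    nonzero entries reflects the effective weight sparsity.
    """
    fractions = {}
    for name, tensor in state_dict.items():
        n_entries = tensor.numel()
        if n_entries == 0:
            continue  # skip empty tensors to avoid dividing by zero
        fractions[name] = tensor.count_nonzero().item() / n_entries
    return fractions


# Example usage with the state_dict loaded above:
# for name, frac in nonzero_fraction_by_tensor(state_dict).items():
#     print(f"{name}: {frac:.4f}")
```

If the assumption holds, the weight matrices should report fractions near the target of 0.0039, while tensors that are not sparsified (for example biases or normalization parameters, if present) may be dense.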