# ss_dense

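Transformer trained with the weight-sparse training procedure from Gao et al. (2025). This checkpoint is the dense variant: weight sparsity and activation sparsity are both disabled (see Sparsity).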

## Model Details

- **Layers**: 4
- **Model Dimension**: 512
- **Context Length**: 512
- **Head Dimension**: 16
- **Vocabulary Size**: 4096
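
The dimensions above imply a few derived quantities. The sketch below is illustrative only: `ModelDims` is a hypothetical helper, not a class from this repo, and it assumes the number of attention heads equals the model dimension divided by the head dimension, which the card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelDims:
    """Hypothetical container for the values listed under Model Details."""

    n_layers: int = 4
    d_model: int = 512
    context_length: int = 512
    d_head: int = 16
    vocab_size: int = 4096

    @property
    def n_heads(self) -> int:
        # Assumption: heads evenly partition the model dimension.
        if self.d_model % self.d_head != 0:
            raise ValueError("d_model must be divisible by d_head")
        return self.d_model // self.d_head

    @property
    def embedding_params(self) -> int:
        # Size of a single vocab_size x d_model token-embedding matrix.
        return self.vocab_size * self.d_model


dims = ModelDims()
print(dims.n_heads)           # 32
print(dims.embedding_params)  # 2097152
```

Under that assumption the model would have 32 heads per layer; the authoritative values are in `config.json`.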

## Sparsity

- **Weight Sparsity**: False
- **Target L0 Fraction**: 1
- **Activation Sparsity**: False
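
A target L0 fraction of 1 is consistent with a fully dense model. As a rough illustration, and assuming "L0 fraction" means the share of weights that are nonzero (the card does not define the term), the quantity can be sketched in plain Python. `l0_fraction` and the toy weight dictionaries are hypothetical and not part of this repo.

```python
from typing import Iterable, Mapping


def l0_fraction(weights: Mapping[str, Iterable[float]]) -> float:
    """Return the fraction of nonzero entries across all weight vectors."""
    total = 0
    nonzero = 0
    for values in weights.values():
        for v in values:
            total += 1
            nonzero += v != 0  # True counts as 1
    if total == 0:
        raise ValueError("no weights given")
    return nonzero / total


# Toy examples, not real model weights.
toy_dense = {"w1": [0.5, -1.2, 0.3, 2.0], "w2": [1.0, -0.1]}
toy_sparse = {"w1": [0.5, 0.0, 0.0, 2.0], "w2": [0.0, -0.1]}

print(l0_fraction(toy_dense))   # 1.0
print(l0_fraction(toy_sparse))  # 0.5
```

For real tensors, the same ratio could be computed per parameter with `torch.count_nonzero(t)` and `t.numel()`.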

## Training

- **Dataset**: SimpleStories/SimpleStories
- **Tokenizer**: SimpleStories/SimpleStories-1.25M
- **Total Tokens**: 2,000,000,000
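
To put the token budget in perspective, the following sketch converts it into a number of training sequences. It assumes every sequence fills the full 512-token context, which the card does not confirm; `full_context_sequences` is a hypothetical helper.

```python
TOTAL_TOKENS = 2_000_000_000  # from the Training section
CONTEXT_LENGTH = 512          # from the Model Details section


def full_context_sequences(total_tokens: int, context_length: int) -> int:
    """Number of whole sequences if each one uses the full context length."""
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    return total_tokens // context_length


n_sequences = full_context_sequences(TOTAL_TOKENS, CONTEXT_LENGTH)
print(f"{n_sequences:,}")  # 3,906,250
```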

## Usage

```python
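import json

import torch
from huggingface_hub import hf_hub_download

# Download the weights and config from the Hugging Face Hub
model_path = hf_hub_download(repo_id="jacobcd52/ss_dense", filename="pytorch_model.bin")
config_path = hf_hub_download(repo_id="jacobcd52/ss_dense", filename="config.json")

# Read the model configuration
with open(config_path) as f:
    config = json.load(f)

# Load the weights on CPU (building the model requires the SparseGPT model class from this repo)
state_dict = torch.load(model_path, map_location="cpu")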
```