**33% pruning on RedPajama 3B linear layers**

The pruned layers are:

1. attention linear layers (query, key, value projections)
2. attention dense (output) layer
3. MLP layers

Pruning is applied in all decoder modules and is unstructured magnitude pruning.
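As a rough illustration of what unstructured magnitude pruning does, the sketch below (an assumption, not the exact pipeline used for this model) zeroes out the fraction of weights with the smallest absolute values in a single weight matrix. In practice this would be applied to each linear layer's weight tensor; frameworks such as PyTorch provide equivalents (e.g. `torch.nn.utils.prune.l1_unstructured`).

```python
import numpy as np

def magnitude_prune(weight: np.ndarray, sparsity: float = 0.33) -> np.ndarray:
    """Zero out the `sparsity` fraction of entries with the smallest magnitude.

    Hypothetical helper for illustration; ties at the threshold may prune
    slightly more than the requested fraction.
    """
    flat = np.abs(weight).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weight.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weight) > threshold  # keep only weights above the threshold
    return weight * mask

# Example: prune a toy 3x3 weight matrix at 33% sparsity
w = np.arange(1.0, 10.0).reshape(3, 3)
pruned = magnitude_prune(w, sparsity=0.33)
```

Because the pruning is unstructured, individual weights are zeroed wherever they fall; no rows, columns, or attention heads are removed, so the layer shapes stay unchanged.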