33% unstructured magnitude pruning on RedPajama 3B linear layers. The pruned layers are:

1. attention linear layers (query, key, and value projections)
2. attention dense (output) layer
3. MLP layers

Pruning is applied in all decoder modules.
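As a minimal sketch of what unstructured magnitude pruning at 33% sparsity looks like, the function below (a hypothetical helper, not the actual pruning code used) zeroes out the smallest-magnitude weights of a layer's weight matrix; in practice this would be applied to each of the linear layers listed above:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.33) -> np.ndarray:
    """Unstructured magnitude pruning: zero out the `sparsity` fraction
    of weights with the smallest absolute values, regardless of position
    (hence 'unstructured')."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to prune
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    # keep only weights strictly above the threshold
    # (ties at the threshold are pruned as well)
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune a 3x3 weight matrix at 33% sparsity
w = np.arange(1.0, 10.0).reshape(3, 3)
pruned = magnitude_prune(w, sparsity=0.33)  # zeroes the 2 smallest weights
```

In a real model this is typically done with a framework utility such as `torch.nn.utils.prune.l1_unstructured`, which applies the same smallest-magnitude criterion via a binary mask on each module's weight tensor.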