33% pruning of RedPajama 3B linear layers

The pruned layers are:

  1. attention linear layers (query, key, value computation)
  2. attention dense layer
  3. MLP layers

Pruning is applied in every decoder block and uses unstructured magnitude pruning: in each linear layer, the 33% of weights with the smallest absolute values are zeroed.
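
A minimal sketch of how this could be reproduced with PyTorch's built-in pruning utilities. It assumes the `togethercomputer/RedPajama-INCITE-Base-3B-v1` base checkpoint (the exact starting checkpoint isn't stated here) and the GPT-NeoX module layout that RedPajama 3B uses:

```python
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForCausalLM

# Assumed base checkpoint; substitute the actual starting model if different.
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/RedPajama-INCITE-Base-3B-v1",
    torch_dtype=torch.float32,
)

AMOUNT = 0.33  # fraction of weights to zero in each linear layer

for layer in model.gpt_neox.layers:  # every decoder block
    targets = [
        layer.attention.query_key_value,  # fused Q/K/V projection
        layer.attention.dense,            # attention output (dense) layer
        layer.mlp.dense_h_to_4h,          # MLP up-projection
        layer.mlp.dense_4h_to_h,          # MLP down-projection
    ]
    for module in targets:
        # Unstructured magnitude pruning: mask the 33% smallest-|w| weights.
        prune.l1_unstructured(module, name="weight", amount=AMOUNT)
        # Make the pruning permanent (bake zeros into the weight tensor).
        prune.remove(module, "weight")
```

`prune.l1_unstructured` masks each layer independently by weight magnitude, and `prune.remove` folds the mask into the weight tensor so the model can be saved without pruning hooks attached.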