33% pruning on RedPajama 3B linear layers
The pruned layers are:
- attention linear layers (query, key, value computation)
- attention dense layer
- MLP layers
Pruning is applied in every decoder module, using unstructured magnitude pruning.
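A minimal sketch of this pruning scheme using PyTorch's built-in `torch.nn.utils.prune` utilities. The small `nn.Sequential` stand-in below is an assumption for illustration; the actual model would be loaded via Hugging Face `transformers`, and the same loop over `nn.Linear` modules would cover the q/k/v projections, attention dense layer, and MLP layers in each decoder block.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Hypothetical stand-in for the decoder's linear layers; the real model
# (RedPajama-INCITE 3B) would be loaded with transformers instead.
model = nn.Sequential(
    nn.Linear(16, 16),  # stands in for the q/k/v projections
    nn.Linear(16, 16),  # stands in for the attention dense layer
    nn.Linear(16, 64),  # stands in for the MLP up-projection
    nn.Linear(64, 16),  # stands in for the MLP down-projection
)

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Unstructured magnitude (L1) pruning: zero out the 33% of
        # weights with the smallest absolute value, per layer.
        prune.l1_unstructured(module, name="weight", amount=0.33)
        prune.remove(module, "weight")  # bake the mask into the weights

# Check the resulting sparsity across all pruned layers.
linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
total = sum(m.weight.numel() for m in linears)
zeros = sum((m.weight == 0).sum().item() for m in linears)
print(f"sparsity: {zeros / total:.2%}")
```

Since the pruning is unstructured, the zeroed weights are scattered within each matrix rather than removing whole rows or columns, so the model's shapes are unchanged.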