33% pruning on RedPajama 3B linear layers
The pruned layers are:
1. attention linear layers (query, key, value computation)
2. attention dense layer
3. MLP layers
Pruning is applied in every decoder block, using unstructured magnitude pruning (the weights with the smallest absolute values are zeroed). A sketch of the procedure is given below.
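
For reference, here is a minimal sketch of how this pruning could be reproduced with PyTorch's `torch.nn.utils.prune` utilities. The checkpoint name (`togethercomputer/RedPajama-INCITE-Base-3B-v1`) and the module paths (`query_key_value`, `dense`, `dense_h_to_4h`, `dense_4h_to_h`) are assumptions based on the GPT-NeoX architecture that RedPajama 3B uses, not code taken from this repo.

```python
# Sketch of 33% unstructured magnitude pruning on the linear layers,
# assuming a GPT-NeoX-style RedPajama 3B checkpoint (module names below
# follow that architecture and are assumptions, not this repo's code).
import torch.nn.utils.prune as prune
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/RedPajama-INCITE-Base-3B-v1"
)

for layer in model.gpt_neox.layers:  # every decoder block
    targets = [
        layer.attention.query_key_value,  # fused query/key/value projection
        layer.attention.dense,            # attention dense (output) layer
        layer.mlp.dense_h_to_4h,          # MLP up-projection
        layer.mlp.dense_4h_to_h,          # MLP down-projection
    ]
    for module in targets:
        # Unstructured magnitude pruning: zero the 33% of weights
        # with the smallest absolute values (L1 criterion).
        prune.l1_unstructured(module, name="weight", amount=0.33)
        # Fold the mask into the weight tensor permanently.
        prune.remove(module, "weight")
```

`prune.remove` folds each pruning mask into the weight tensor itself, so a checkpoint saved afterwards contains plain zeroed weights rather than separate masks.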