How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Advaith28/Linear_pruned_RedPajama3B")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Advaith28/Linear_pruned_RedPajama3B")
model = AutoModelForCausalLM.from_pretrained("Advaith28/Linear_pruned_RedPajama3B")

33% pruning on RedPajama 3B linear layers

The pruned layers are:

  1. attention linear layers (query, key, value computation)
  2. attention dense layer
  3. MLP layers
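
To make the list above concrete, here is a minimal sketch of how those layers could be located and their sparsity measured. It assumes the base model follows the GPT-NeoX layout in Transformers, where the submodules are named `attention.query_key_value`, `attention.dense`, `mlp.dense_h_to_4h` and `mlp.dense_4h_to_h`; this card does not confirm those names, and the helpers `pruned_linear_layers` and `sparsity` are hypothetical names introduced only for illustration.

```python
from torch import nn

# Assumption: GPT-NeoX-style submodule names; not confirmed by this model card.
PRUNED_SUFFIXES = (
    "attention.query_key_value",  # fused query/key/value projection
    "attention.dense",            # attention output projection
    "mlp.dense_h_to_4h",          # MLP up-projection
    "mlp.dense_4h_to_h",          # MLP down-projection
)


def pruned_linear_layers(model: nn.Module):
    """Yield (name, module) for nn.Linear layers whose name ends with a listed suffix."""
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and name.endswith(PRUNED_SUFFIXES):
            yield name, module


def sparsity(layer: nn.Linear) -> float:
    """Return the fraction of weight entries that are exactly zero."""
    weight = layer.weight.detach()
    return (weight == 0).sum().item() / weight.numel()
```

With the `model` object loaded as shown earlier, looping over `pruned_linear_layers(model)` and printing `sparsity(module)` for each entry would show whether the per-layer zero fraction is close to the stated 33%.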

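Pruning is applied in every decoder module. The method is unstructured magnitude pruning: the weights with the smallest absolute values are set to zero, with no constraint on where in a weight matrix they fall.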
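
As an illustration of unstructured magnitude pruning, the sketch below zeroes the smallest-magnitude third of the weights in a single small linear layer. It uses PyTorch's `torch.nn.utils.prune.l1_unstructured`, which is an assumption: this card does not say which tool produced the checkpoint, nor whether the 33% threshold was chosen per layer (as here) or across all layers at once. The helper name `magnitude_prune_` and the toy layer size are hypothetical.

```python
import torch
from torch import nn
from torch.nn.utils import prune


def magnitude_prune_(layer: nn.Linear, amount: float = 0.33) -> nn.Linear:
    """Zero the `amount` fraction of weights with the smallest absolute value, in place."""
    # Adds `weight_orig` and `weight_mask`; `weight` becomes their product.
    prune.l1_unstructured(layer, name="weight", amount=amount)
    # Bake the mask in so `weight` is a plain parameter again.
    prune.remove(layer, "weight")
    return layer


torch.manual_seed(0)
toy_layer = magnitude_prune_(nn.Linear(12, 10))  # toy size, not the real model
zero_fraction = (toy_layer.weight == 0).float().mean().item()
print(f"zero fraction: {zero_fraction:.3f}")
```

Only the weight matrix is pruned in this sketch; the bias is left untouched. Whether the original checkpoint treated biases the same way is not stated.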

Model size: 3B params (Safetensors)
Tensor types: F32, F16, I8