How to use prdev/gpt2-differential-linear-attention with Transformers:
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("prdev/gpt2-differential-linear-attention", dtype="auto")
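Once the model is loaded, a forward pass might look like the sketch below. It is illustrative only and not taken from the model card. The helper name `run_forward_pass` is hypothetical, and the use of the stock `gpt2` tokenizer is an assumption, since the snippet above does not say which tokenizer the checkpoint expects. The `dtype="auto"` argument is copied from the loading snippet; older Transformers releases may expect `torch_dtype` instead.

```python
MODEL_ID = "prdev/gpt2-differential-linear-attention"


def run_forward_pass(text, tokenizer_id="gpt2"):
    """Tokenize `text`, run it through the model once, and return the raw outputs.

    `tokenizer_id` defaults to the stock GPT-2 tokenizer, which is an
    assumption; check the model repository for the tokenizer it was trained with.
    """
    # Imported lazily so that defining this helper does not require
    # torch or transformers to be installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AutoModel.from_pretrained(MODEL_ID, dtype="auto")
    model.eval()  # disable dropout for inference

    # Convert the text to PyTorch tensors (input_ids, attention_mask).
    inputs = tokenizer(text, return_tensors="pt")

    # No gradients are needed for a single inference pass.
    with torch.no_grad():
        return model(**inputs)
```

Calling `run_forward_pass("Hello, world")` downloads the weights on first use, so it needs network access and enough disk space for the checkpoint. The structure of the returned object depends on the model class that `AutoModel` resolves to.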