DistillLens-gpt2-base

paper | code

DistillLens-gpt2-base is a gpt2-base (120M-parameter) model distilled from gpt2-xlarge (1.5B parameters) on the databricks-dolly-15k dataset.
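The checkpoint can be loaded like any other GPT-2-family causal LM. A minimal usage sketch with the transformers library (the repo id is taken from this card; the prompt format is illustrative, not prescribed by the paper):

```python
# Load the distilled checkpoint from the Hugging Face Hub and generate text.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "manishdhakal/DistillLens-gpt2-base"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Example prompt; adjust to match your instruction format.
inputs = tokenizer("Instruction: Explain knowledge distillation.\nResponse:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```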

Method

Note: DistillLens requires a pretrained GPT-2 Base checkpoint for initialization before performing the distillation.
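The exact DistillLens objective (symmetric distillation through the logit lens) is defined in the paper, not reproduced on this card. As a generic illustration only, a minimal plain-Python sketch of a symmetric KL divergence between student and teacher next-token logits (all function names here are hypothetical):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    # KL(p || q) over two discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kd_loss(student_logits, teacher_logits):
    # Symmetrized KL between student and teacher token distributions.
    p = softmax(student_logits)
    q = softmax(teacher_logits)
    return 0.5 * (kl(p, q) + kl(q, p))

# Identical logits give zero loss; divergent logits give a positive loss.
print(symmetric_kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))       # 0.0
print(symmetric_kd_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]) > 0)   # True
```

In practice this would be computed per position over the full vocabulary with a framework such as PyTorch; the sketch only shows the shape of a symmetric objective.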

Citation

@article{dhakal2026distilllens,
  title={DistillLens: Symmetric Knowledge Distillation Through Logit Lens},
  author={Dhakal, Manish and Jinadu, Uthman and Budathoki, Anjila and Sunderraman, Rajshekhar and Ding, Yi},
  journal={arXiv preprint arXiv:2602.13567},
  year={2026}
}