MiniLoong-3B

📑 arXiv | 👻 GitHub | 🤗 HuggingFace-MiniMA-3B | 🤗 HuggingFace-MiniChat-3B | 🤖 ModelScope-MiniMA-3B | 🤖 ModelScope-MiniChat-3B | 🤗 HuggingFace-MiniChat-1.5-3B | 🤗 HuggingFace-MiniMA-2-3B | 🤗 HuggingFace-MiniChat-2-3B | 🤗 HuggingFace-MiniMA-2-1B | 🤗 HuggingFace-MiniLoong-3B | 🤗 HuggingFace-MiniMix-2/4x3B

❗ Use of this model must comply with the LLaMA-2 license, since the model is derived from LLaMA-2.

Bibtex

@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}