Model Card for TokAlign-Pythia-1b-Distill-Qwen-2-7b
This model is initialized from TokAlign-Pythia-1b-Qwen2-Tokenizer and distilled from Qwen2-7B at the token level.
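A minimal inference sketch using the `transformers` library. The Hub repository id below is assumed from the model name and may differ; substitute the actual id when loading.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub id, inferred from the model name
model_id = "TokAlign-Pythia-1b-Distill-Qwen-2-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```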
Code
The code used to train this model is available in the accompanying GitHub repository.
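For illustration, a minimal sketch of a generic token-level distillation objective, assuming the student and teacher share the same (Qwen2) vocabulary so their next-token distributions are directly comparable. This is not the repository's implementation; refer to the GitHub repo for the actual training code.

```python
import torch
import torch.nn.functional as F

def token_level_kd_loss(student_logits: torch.Tensor,
                        teacher_logits: torch.Tensor,
                        temperature: float = 1.0) -> torch.Tensor:
    """KL(teacher || student) over the shared vocabulary at every
    token position, averaged across positions.

    Both logits tensors have shape (batch, seq_len, vocab_size).
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Per-position KL: sum over the vocabulary dimension
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(-1)
    # Standard temperature-squared scaling from distillation practice
    return (temperature ** 2) * kl.mean()
```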
Citation
@inproceedings{li-etal-2025-TokAlign,
  author    = {Chong Li and Jiajun Zhang and Chengqing Zong},
  title     = {TokAlign: Efficient Vocabulary Adaptation via Token Alignment},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year      = {2025},
  address   = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
}