Model Card for TokAlign-Pythia-6.9b-Distill-LLaMA3-8b

This model is initialized from TokAlign-Pythia-6.9b-LLaMA3-Tokenizer and then token-level distilled from LLaMA3-8b.
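As a rough illustration of what "token-level distillation" means here, the sketch below computes a per-position KL divergence between a teacher's and a student's next-token distributions and averages it over the sequence. This is a generic formulation in plain Python, not the authors' actual training code; the function names and the choice of forward KL are assumptions for illustration.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_kl(teacher_logits, student_logits):
    """KL(teacher || student) at a single token position."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0)

def distill_loss(teacher_seq, student_seq):
    """Mean per-token KL over aligned positions of a sequence."""
    kls = [token_kl(t, s) for t, s in zip(teacher_seq, student_seq)]
    return sum(kls) / len(kls)
```

The loss is zero when the student exactly matches the teacher's distribution at every position, and positive otherwise; in practice this would be computed on logits from both models over the shared (aligned) vocabulary.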

Code

The code used to train this model is available in the GitHub repository.

Citation

@inproceedings{li-etal-2025-TokAlign,
  author    = {Chong Li and
               Jiajun Zhang and
               Chengqing Zong},
  title     = {TokAlign: Efficient Vocabulary Adaptation via Token Alignment},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year      = {2025},
  address   = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
}
Format: Safetensors · Model size: 7B params · Tensor type: BF16