---
license: mit
tags:
- discrete tokenization
- autoregressive generation
---

# InsightTok

InsightTok is a discrete visual tokenizer designed to improve the fidelity of **text** and **faces**, two of the most challenging yet perceptually important structures in autoregressive image generation. It was introduced in the paper *InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation*.

- **Paper:** [InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation](https://huggingface.co/papers)
- **Code:** [https://github.com/LeapLabTHU/InsightTok](https://github.com/LeapLabTHU/InsightTok)

## Model Details

| Property | Value |
|---|---:|
| Downsampling rate | 16× |
| Codebook size | 16,384 |
| Latent dimension | 256 |
| Number of parameters | 426M |

## Performance

InsightTok achieves strong text and face reconstruction quality while maintaining a compact discrete representation.
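
To make "compact" concrete, the sketch below works out the token budget implied by the 16× downsampling rate and the 16,384-entry codebook listed under Model Details. It is a back-of-the-envelope calculation, not part of the InsightTok codebase: the helper name `token_budget`, the 256×256 input size, and the assumption of one token per spatial location (a single codebook, no residual or multi-scale levels) are illustrative choices that the model card itself does not state.

```python
import math


def token_budget(height, width, downsample=16, codebook_size=16_384):
    """Estimate the discrete token budget for one image.

    Hypothetical helper. Assumes a single 2D token grid with one
    codebook index per spatial location, which is an assumption and
    not something the model card confirms.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be multiples of the downsampling rate")

    # Each side shrinks by the downsampling rate.
    grid_h, grid_w = height // downsample, width // downsample
    num_tokens = grid_h * grid_w

    # An index into a codebook of size K carries log2(K) bits.
    bits_per_token = math.log2(codebook_size)
    total_bits = num_tokens * bits_per_token

    return grid_h, grid_w, num_tokens, bits_per_token, total_bits


grid_h, grid_w, num_tokens, bits_per_token, total_bits = token_budget(256, 256)
print(f"grid: {grid_h}x{grid_w}, tokens: {num_tokens}")
print(f"bits per token: {bits_per_token}, total bits: {total_bits}")
```

Under these assumptions a 256×256 image maps to a 16×16 grid of 256 tokens at 14 bits each, or 3,584 bits in total. Because the token count scales with image area, a 512×512 input would yield 1,024 tokens.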