---
license: mit
datasets:
- open-thoughts/OpenThoughts-114k
language:
- en
base_model:
- Qwen/Qwen3-1.7B
- Qwen/Qwen3-4B
- Qwen/Qwen2.5-1.5B-Instruct
tags:
- vae
---

# VAE Layer for the Research *Gated Latent Reasoning Loop* (tentative name)

> Please refer to our code: [https://github.com/elliot-zzh/from-transparent-to-opaque](https://github.com/elliot-zzh/from-transparent-to-opaque).
> The project is under construction, and we will publish the paper once we are ready.

These are the pretrained VAE layers for the research *Gated Latent Reasoning Loop* (tentative name). There are 3 VAEs, each applied to a different model:

- [`vae_epoch10.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch10.pth): The VAE for [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B). The input size is 4096 (not yet confirmed).
- [`vae_epoch15.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch15.pth): The VAE for [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B). The input size is 2048.
- [`vae_epoch14.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch14.pth): The VAE for [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct). The input size is 1536.

The VAE consists of two linear layers: a compressor (encoder) and a decompressor (decoder).
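As a rough illustration of the two-linear-layer structure described above, the sketch below shows what such a VAE might look like in PyTorch. The class name, latent size, and the exact encoder output layout are assumptions for illustration, not the repository's actual implementation; refer to the linked code for the real definition.

```python
# Hypothetical sketch of a VAE with one linear compressor and one linear
# decompressor. Names and the latent size are assumptions, not the
# repository's actual implementation.
import torch
import torch.nn as nn


class LinearVAE(nn.Module):
    def __init__(self, input_size: int, latent_size: int):
        super().__init__()
        # Compressor: maps a hidden state to the mean and log-variance
        # of the latent distribution (concatenated along the last dim).
        self.compressor = nn.Linear(input_size, 2 * latent_size)
        # Decompressor: maps a sampled latent back to the hidden-state size.
        self.decompressor = nn.Linear(latent_size, input_size)

    def forward(self, x: torch.Tensor):
        mu, logvar = self.compressor(x).chunk(2, dim=-1)
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decompressor(z), mu, logvar


# Example: the Qwen3-1.7B variant with input size 2048 (latent size assumed).
vae = LinearVAE(input_size=2048, latent_size=256)
out, mu, logvar = vae(torch.randn(1, 2048))
print(out.shape)  # torch.Size([1, 2048])
```

To use one of the published checkpoints, the state dict would be loaded with `vae.load_state_dict(torch.load("vae_epoch15.pth"))`, assuming the layer names match the repository's definition.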