---
license: mit
datasets:
- open-thoughts/OpenThoughts-114k
language:
- en
base_model:
- Qwen/Qwen3-1.7B
- Qwen/Qwen3-4B
- Qwen/Qwen2.5-1.5B-Instruct
tags:
- vae
---

# VAE Layer for the Research *Gated Latent Reasoning Loop* (tentative name)

> Please refer to our code: [https://github.com/elliot-zzh/from-transparent-to-opaque](https://github.com/elliot-zzh/from-transparent-to-opaque).
> The project is under construction, and we will publish the paper once we are ready.

This is the pretrained VAE layer for the research *Gated Latent Reasoning Loop* (tentative name).

There are three VAEs, each trained for a different base model:

- [`vae_epoch10.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch10.pth): the VAE for [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B). The input size is 4096 (although this is not confirmed).
- [`vae_epoch15.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch15.pth): the VAE for [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B). The input size is 2048.
- [`vae_epoch14.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch14.pth): the VAE for [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct). The input size is 1536.

The VAE has a simple structure: two linear layers, a compressor and an uncompressor.
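
The two-linear-layer structure described above can be sketched as follows. This is a minimal PyTorch illustration, not the repository's actual implementation: the class name, attribute names, and the latent size are assumptions, and the exact checkpoint layout should be checked against the code linked above.

```python
import torch
import torch.nn as nn


class VAE(nn.Module):
    """Minimal two-linear-layer VAE: a compressor and an uncompressor.

    This is an illustrative sketch; the real module in the repository
    may differ in naming, latent size, and loss handling.
    """

    def __init__(self, hidden_size: int, latent_size: int):
        super().__init__()
        # Compressor: projects hidden states down, producing the mean
        # and log-variance of the latent distribution in one pass.
        self.compressor = nn.Linear(hidden_size, 2 * latent_size)
        # Uncompressor: projects a sampled latent back to the hidden size.
        self.uncompressor = nn.Linear(latent_size, hidden_size)

    def forward(self, x: torch.Tensor):
        mu, logvar = self.compressor(x).chunk(2, dim=-1)
        # Reparameterization trick: sample z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.uncompressor(z), mu, logvar


# Example for the Qwen3-1.7B variant (input size 2048); latent_size=256
# is an assumed value for illustration only.
vae = VAE(hidden_size=2048, latent_size=256)
x = torch.randn(1, 2048)
recon, mu, logvar = vae(x)
print(recon.shape)  # torch.Size([1, 2048])
```

A real checkpoint would then be loaded with something like `vae.load_state_dict(torch.load("vae_epoch15.pth"))`, assuming the file stores a plain `state_dict` with matching parameter names.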