---
license: mit
datasets:
- open-thoughts/OpenThoughts-114k
language:
- en
base_model:
- Qwen/Qwen3-1.7B
- Qwen/Qwen3-4B
- Qwen/Qwen2.5-1.5B-Instruct
tags:
- vae
---

# VAE Layer for the Research *Gated Latent Reasoning Loop* (tentative name)

> Please refer to our code: [https://github.com/elliot-zzh/from-transparent-to-opaque](https://github.com/elliot-zzh/from-transparent-to-opaque).
> The project is under construction, and we will publish the paper once we are ready.

This is the pretrained VAE layer for the research *Gated Latent Reasoning Loop* (tentative name).

There are three VAEs, each trained for a different base model:

- [`vae_epoch10.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch10.pth): the VAE for [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B). The input size is 4096 (although this is not confirmed).
- [`vae_epoch15.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch15.pth): the VAE for [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B). The input size is 2048.
- [`vae_epoch14.pth`](https://huggingface.co/ethangoh7086cmd/gated-latent-reasoning-loop-vae/blob/main/vae_epoch14.pth): the VAE for [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct). The input size is 1536.

The VAE has a simple structure: two linear layers, a compressor and an uncompressor.
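
The two-linear-layer structure described above can be sketched as follows. This is a minimal PyTorch illustration, not the repository's actual implementation: the class name, attribute names, and the latent size are assumptions, and the exact checkpoint layout should be checked against the code linked above.

```python
import torch
import torch.nn as nn


class VAE(nn.Module):
    """Minimal two-linear-layer VAE: a compressor and an uncompressor.

    This is an illustrative sketch; the real module in the repository
    may differ in naming, latent size, and loss handling.
    """

    def __init__(self, hidden_size: int, latent_size: int):
        super().__init__()
        # Compressor: projects hidden states down, producing the mean
        # and log-variance of the latent distribution in one pass.
        self.compressor = nn.Linear(hidden_size, 2 * latent_size)
        # Uncompressor: projects a sampled latent back to the hidden size.
        self.uncompressor = nn.Linear(latent_size, hidden_size)

    def forward(self, x: torch.Tensor):
        mu, logvar = self.compressor(x).chunk(2, dim=-1)
        # Reparameterization trick: sample z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.uncompressor(z), mu, logvar


# Example for the Qwen3-1.7B variant (input size 2048); latent_size=256
# is an assumed value for illustration only.
vae = VAE(hidden_size=2048, latent_size=256)
x = torch.randn(1, 2048)
recon, mu, logvar = vae(x)
print(recon.shape)  # torch.Size([1, 2048])
```

A real checkpoint would then be loaded with something like `vae.load_state_dict(torch.load("vae_epoch15.pth"))`, assuming the file stores a plain `state_dict` with matching parameter names.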