# SVQVAE (Scalable Vector Quantized Variational Autoencoder)

GitHub: https://github.com/Open-Model-Initiative/SVQVAE
A scalable Vector Quantized Variational Autoencoder (VQVAE) for high-resolution image generation and reconstruction. The model supports tiled processing, allowing large images to be handled efficiently.

## Model Description

SVQVAE is a scalable variant of the Vector Quantized Variational Autoencoder that processes high-resolution images through tiled encoding and decoding. The model uses a discrete codebook to compress images into a latent representation and can reconstruct them at multiple scales.
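The core of the quantization step described above is a nearest-neighbour lookup: each encoder output vector is replaced by the closest entry in the discrete codebook. This is a minimal NumPy sketch of that idea; the function name, shapes, and distance metric are illustrative assumptions, not the SVQVAE API.

```python
import numpy as np

def quantize(latents, codebook):
    """Replace each latent vector with its nearest codebook entry.

    latents:  (N, D) array of encoder outputs.
    codebook: (K, D) array of discrete code vectors.
    Returns the quantized vectors and their codebook indices.
    """
    # Squared L2 distance from every latent to every codebook entry: (N, K).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)  # one codebook index per latent
    return codebook[indices], indices
```

The returned indices are what make the representation discrete: an image compresses to a grid of integer codebook IDs rather than continuous latents.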
### Key Features

- **Scalable Processing**: Handles high-resolution images through tiled processing
- **Multi-scale Output**: Can generate reconstructions at different scales
- **Vector Quantization**: Uses a discrete codebook for efficient compression
- **Attention Mechanisms**: Includes self-attention blocks for better feature learning
- **Flexible Architecture**: Configurable encoder/decoder with customizable channel multipliers
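Tiled processing, as listed above, amounts to splitting a large image into tiles, running each tile through the model independently, and stitching the outputs back together. The sketch below shows the pattern with a placeholder `fn` standing in for a full encode/decode pass; the function name and the non-overlapping tiling scheme are assumptions for illustration, not the repository's implementation.

```python
import numpy as np

def process_tiled(image, tile_size, fn):
    """Apply fn to non-overlapping tiles of image and stitch the results.

    image: (H, W, C) array whose H and W are divisible by tile_size.
    fn:    any shape-preserving function (a stand-in for the model's
           encode/decode pass on a single tile).
    """
    out = np.empty_like(image)
    h, w = image.shape[:2]
    for y in range(0, h, tile_size):
        for x in range(0, w, tile_size):
            tile = image[y:y + tile_size, x:x + tile_size]
            out[y:y + tile_size, x:x + tile_size] = fn(tile)
    return out
```

Because each tile is processed independently, peak memory scales with the tile size rather than the full image resolution, which is what makes very large inputs tractable.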
## Citation

If you use this code in your research, please cite Austin J. Bryant and the Open Model Initiative.
## Acknowledgments

This implementation is based on the VQVAE architecture and includes improvements for scalable processing of high-resolution images.
## Repository Links

- **GitHub Repository**: [Open-Model-Initiative/SVQVAE](https://github.com/Open-Model-Initiative/SVQVAE)
- **Model Weights**: Available in this Hugging Face repository
- **Documentation**: See the GitHub repository for detailed documentation and examples
## License

This model is licensed under the OpenMDW License Agreement (see the LICENSE file).