GLM-IMAGE-PRO

Paused

fantos committed on Jan 19

Commit

b0483fe

1 Parent(s): b4de204

Update README.md

Files changed (1)

README.md +19 -18

@@ -8,30 +8,31 @@ sdk_version: 6.3.0
 app_file: app.py
 pinned: false
 license: openrail++
-short_description: 'FLUX 8 Step Fast & High Quality Mode'
 ---
-Introduction
-GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM‑Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge‑intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high‑fidelity and fine‑grained detail generation. In addition to text‑to‑image generation, GLM‑Image also supports a rich set of image‑to‑image tasks including image editing, style transfer, identity‑preserving generation, and multi‑subject consistency.
-Model architecture: a hybrid autoregressive + diffusion decoder design.
-Autoregressive generator: a 9B-parameter model initialized from GLM-4-9B-0414, with an expanded vocabulary to incorporate visual tokens. The model first generates a compact encoding of approximately 256 tokens, then expands to 1K–4K tokens, corresponding to 1K–2K high-resolution image outputs.
-Diffusion Decoder: a 7B-parameter decoder based on a single-stream DiT architecture for latent-space image decoding. It is equipped with a Glyph Encoder text module, significantly improving accurate text rendering within images.
-architecture_2
-Post-training with decoupled reinforcement learning: the model introduces a fine-grained, modular feedback strategy using the GRPO algorithm, substantially enhancing both semantic understanding and visual detail quality.
-Autoregressive module: provides low-frequency feedback signals focused on aesthetics and semantic alignment, improving instruction following and artistic expressiveness.
-Decoder module: delivers high-frequency feedback targeting detail fidelity and text accuracy, resulting in highly realistic textures as well as more precise text rendering.
-GLM-Image supports both text-to-image and image-to-image generation within a single model.
-Text-to-image: generates high-detail images from textual descriptions, with particularly strong performance in information-dense scenarios.
-Image-to-image: supports a wide range of tasks, including image editing, style transfer, multi-subject consistency, and identity-preserving generation for people and objects.
-License
-The overall GLM-Image model is released under the MIT License.
-This project incorporates the VQ tokenizer weights and VIT weights from X-Omni/X-Omni-En, which are licensed under the Apache License, Version 2.0.
-The VQ tokenizer and VIT weights remain subject to the original Apache-2.0 terms. Users should comply with the respective licenses when using these components.

 app_file: app.py
 pinned: false
 license: openrail++
+short_description: 'AI image generator with precise text rendering'
 ---
+# 🎨 GLM-Image - AI Image Generator
+## Overview
+An AI model with a 9B-parameter autoregressive generator and a 7B-parameter diffusion decoder that generates high-quality images from text prompts.
+## ✨ Key Features
+**🔤 Precise Text Rendering**
+- Generates accurate, readable text within images
+- Perfect for logos, posters, signs, and text-heavy visuals
+**🖼️ Versatile Generation Modes**
+- Text-to-Image: Create images from text descriptions
+- Image-to-Image: Edit, style transfer, and identity-preserving generation
+**⚡ Hybrid Architecture**
+- Autoregressive + Diffusion combined structure
+- 9B generator + 7B decoder for high-resolution output
+**🎯 RL-Optimized Quality**
+- Fine-tuned with GRPO algorithm for enhanced details and aesthetics
+## 📜 License
+MIT License (Commercial use allowed)
+#GLMImage #AIImageGeneration #TextToImage #ImageToImage #TextRendering #AIArt #ImageEditing #StyleTransfer #OpenSourceAI #HuggingFace #Diffusion #FreeAI #GenerativeAI
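
The removed introduction says the autoregressive generator first produces about 256 tokens, then expands to 1K–4K tokens for 1K–2K image outputs. The sketch below works through what those figures could imply per token. It is only illustrative arithmetic under assumptions the README does not confirm: square images, a uniform square token grid, and "1K"/"2K"/"4K" read as 1024/2048/4096. The helper names `tokens_per_side` and `pixels_per_token` are hypothetical and not part of any GLM-Image API.

```python
import math


def tokens_per_side(num_tokens: int) -> int:
    """Side length of a square grid holding num_tokens tokens (assumed layout)."""
    side = math.isqrt(num_tokens)
    if side * side != num_tokens:
        raise ValueError(f"{num_tokens} tokens do not form a square grid")
    return side


def pixels_per_token(image_side: int, num_tokens: int) -> int:
    """Pixels covered along one axis by a single token, assuming a uniform grid."""
    side = tokens_per_side(num_tokens)
    if image_side % side != 0:
        raise ValueError("image side is not a multiple of the token grid side")
    return image_side // side


# Assumed readings of the README figures: 1K = 1024, 2K = 2048, 4K = 4096.
for image_side, num_tokens in [(1024, 1024), (2048, 4096)]:
    grid = tokens_per_side(num_tokens)
    patch = pixels_per_token(image_side, num_tokens)
    print(f"{image_side}px image, {num_tokens} tokens: "
          f"{grid}x{grid} grid, {patch}x{patch} px per token")
```

Under these assumptions both ends of the stated range give the same 32×32-pixel footprint per token, and the initial 256-token encoding would correspond to a 16×16 grid. Non-square aspect ratios or a different tokenizer stride would change these numbers.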