fantos committed on
Commit b0483fe · verified · 1 Parent(s): b4de204

Update README.md

Files changed (1)
  1. README.md +19 -18
README.md CHANGED
@@ -8,30 +8,31 @@ sdk_version: 6.3.0
  app_file: app.py
  pinned: false
  license: openrail++
- short_description: 'FLUX 8 Step Fast & High Quality Mode'
+ short_description: 'AI image generator with precise text rendering'
  ---
- Introduction
- GLM-Image is an image generation model adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM‑Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text-rendering and knowledge‑intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high‑fidelity and fine‑grained detail generation. In addition to text‑to‑image generation, GLM‑Image also supports a rich set of image‑to‑image tasks including image editing, style transfer, identity‑preserving generation, and multi‑subject consistency.
+ # 🎨 GLM-Image - AI Image Generator
 
- Model architecture: a hybrid autoregressive + diffusion decoder design.
+ ## Overview
+ A 9-billion parameter AI model that generates high-quality images from text prompts
 
- Autoregressive generator: a 9B-parameter model initialized from GLM-4-9B-0414, with an expanded vocabulary to incorporate visual tokens. The model first generates a compact encoding of approximately 256 tokens, then expands to 1K–4K tokens, corresponding to 1K–2K high-resolution image outputs.
- Diffusion Decoder: a 7B-parameter decoder based on a single-stream DiT architecture for latent-space image decoding. It is equipped with a Glyph Encoder text module, significantly improving accurate text rendering within images.
- architecture_2
+ ## Key Features
 
- Post-training with decoupled reinforcement learning: the model introduces a fine-grained, modular feedback strategy using the GRPO algorithm, substantially enhancing both semantic understanding and visual detail quality.
+ **🔤 Precise Text Rendering**
+ - Generates accurate, readable text within images
+ - Perfect for logos, posters, signs, and text-heavy visuals
 
- Autoregressive module: provides low-frequency feedback signals focused on aesthetics and semantic alignment, improving instruction following and artistic expressiveness.
- Decoder module: delivers high-frequency feedback targeting detail fidelity and text accuracy, resulting in highly realistic textures as well as more precise text rendering.
- GLM-Image supports both text-to-image and image-to-image generation within a single model.
+ **🖼️ Versatile Generation Modes**
+ - Text-to-Image: Create images from text descriptions
+ - Image-to-Image: Edit, style transfer, and identity-preserving generation
 
- Text-to-image: generates high-detail images from textual descriptions, with particularly strong performance in information-dense scenarios.
- Image-to-image: supports a wide range of tasks, including image editing, style transfer, multi-subject consistency, and identity-preserving generation for people and objects.
+ **⚡ Hybrid Architecture**
+ - Autoregressive + Diffusion combined structure
+ - 9B generator + 7B decoder for high-resolution output
 
- License
- The overall GLM-Image model is released under the MIT License.
+ **🎯 RL-Optimized Quality**
+ - Fine-tuned with GRPO algorithm for enhanced details and aesthetics
 
- This project incorporates the VQ tokenizer weights and VIT weights from X-Omni/X-Omni-En, which are licensed under the Apache License, Version 2.0.
-
- The VQ tokenizer and VIT weights remains subject to the original Apache-2.0 terms. Users should comply with the respective licenses when using this component.
+ ## 📜 License
+ MIT License (Commercial use allowed)
 
+ #GLMImage #AIImageGeneration #TextToImage #ImageToImage #TextRendering #AIArt #ImageEditing #StyleTransfer #OpenSourceAI #HuggingFace #Diffusion #FreeAI #GenerativeAI
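
Reviewer note: the removed README text quotes token counts (about 256 compact tokens, then 1K–4K tokens for 1K–2K images) that the new summary drops. The sketch below shows one way those numbers can line up. It assumes a square image and one visual token per 32×32-pixel patch; that patch size is a guess chosen because it reproduces the quoted figures, not something the README states, and `visual_token_count` is a hypothetical helper, not part of any GLM-Image API.

```python
def visual_token_count(side_px: int, patch_px: int = 32) -> int:
    """Count tokens in a square grid with one token per patch.

    patch_px=32 is an assumption made for illustration; the README
    does not state the tokenizer's actual downsampling factor.
    """
    if side_px <= 0 or patch_px <= 0:
        raise ValueError("side_px and patch_px must be positive")
    if side_px % patch_px != 0:
        raise ValueError("side_px must be a multiple of patch_px")
    tokens_per_side = side_px // patch_px
    return tokens_per_side * tokens_per_side


if __name__ == "__main__":
    # 1024 and 2048 correspond to the "1K-2K" outputs quoted in the README.
    for side in (1024, 2048):
        print(f"{side}px -> {visual_token_count(side)} tokens")
```

Under the same assumption, a 16×16 grid would give the roughly 256-token compact encoding mentioned in the removed text.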