nielsr HF Staff committed on
Commit 4fe0504 · verified · 1 Parent(s): 7329760

Add model card, link to code, project page

This PR adds a model card for the paper [LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis](https://huggingface.co/papers/2503.21749).

Files changed (1)
  1. README.md +14 -5
README.md CHANGED
@@ -1,17 +1,22 @@
  ---
- license: mit
+ base_model:
+ - Alpha-VLLM/Lumina-Image-2.0
  datasets:
  - X-ART/LeX-10K
- pipeline_tag: text-to-image
  library_name: diffusers
+ license: mit
+ pipeline_tag: text-to-image
  tags:
  - art
  - text-rendering
- base_model:
- - Alpha-VLLM/Lumina-Image-2.0
  ---
+
  **LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis**
 
+ This repository contains the model presented in the paper [LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis](https://huggingface.co/papers/2503.21749).
+
+ The abstract of the paper is the following:
+
  We introduce LeX-Art, a comprehensive suite for high-quality text-image synthesis that systematically bridges the gap between prompt expressiveness and text rendering fidelity. Our approach follows a data-centric paradigm, constructing a high-quality data synthesis pipeline based on Deepseek-R1 to curate LeX-10K, a dataset of 10K high-resolution, aesthetically refined 1024$\times$1024 images. Beyond dataset construction, we develop LeX-Enhancer, a robust prompt enrichment model, and train two text-to-image models, LeX-FLUX and LeX-Lumina, achieving state-of-the-art text rendering performance. To systematically evaluate visual text generation, we introduce LeX-Bench, a benchmark that assesses fidelity, aesthetics, and alignment, complemented by Pairwise Normalized Edit Distance (PNED), a novel metric for robust text accuracy evaluation. Experiments demonstrate significant improvements, with LeX-Lumina achieving a 22.16\% PNED gain, and LeX-FLUX outperforming baselines in color (+10.32\%), positional (+5.60\%), and font accuracy (+5.63\%). The codes, models, datasets, and demo are publicly available.
  ![demo](teaser.png)
  **Usage of LeX-Lumina:**
@@ -37,4 +42,8 @@ image = pipe(
 
  ).images[0]
  image.save("lex_lumina_demo.png")
- ```
+ ```
+
+ See also:
+ * [Project page](https://zhaoshitian.github.io/lexart/)
+ * [Code](https://github.com/zhaoshitian/LeX-Art)
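
Note for readers of the abstract above: the PNED metric builds on edit distance between the target text and the text actually rendered in the image. The sketch below shows only a plain normalized Levenshtein distance between two strings. It is not the paper's PNED definition (the pairwise matching step is described in the paper and is not reproduced here), and the function names `levenshtein` and `normalized_edit_distance` are illustrative, not taken from the LeX-Art code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(target: str, rendered: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length.
    Illustrative only: 0.0 means identical strings, 1.0 means nothing matches."""
    longest = max(len(target), len(rendered))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(target, rendered) / longest
```

Under this simplified measure, a lower value means the rendered text is closer to the requested text; for example, one wrong character in a four-letter word gives a distance of 0.25.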