zRzRzRzRzRzRzR committed · Commit 2c433cc · Parent(s): a32e429

offload
README.md CHANGED

@@ -165,7 +165,7 @@ curl -s -X POST "http://localhost:30000/v1/images/edits" \
| 165 | + Please ensure that all text intended to be rendered in the image is enclosed in quotation marks in the model input, and we strongly recommend using GLM-4.7 to enhance prompts for higher image quality. See [our GitHub script](https://raw.githubusercontent.com/zai-org/GLM-Image/refs/heads/main/examples/prompt_utils.py) for more details.
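For instance, an input prompt with quoted render text might look like this (an illustrative prompt of our own, not one from the official examples):

```python
# Text that should appear inside the generated image is wrapped in quotation marks.
prompt = 'A vintage travel poster titled "FLY ME TO THE MOON", retro typography, warm colors'
```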
| 166 | + The AR model used in GLM-Image is configured with `do_sample=True`, a temperature of `0.9`, and a top-p of `0.75` by default. A higher temperature yields more diverse and richer outputs, but it may also reduce output stability.
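As a hedged sketch, these defaults could be overridden with a plain keyword dictionary; the exact parameter names accepted by the GLM-Image entry point are an assumption, so treat this as illustrative:

```python
# Default sampling values quoted in the note above; how they are passed
# into the GLM-Image pipeline is an assumption, not the confirmed API.
sampling_kwargs = {
    "do_sample": True,
    "temperature": 0.9,  # higher values: more diverse but less stable outputs
    "top_p": 0.75,
}
```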
| 167 | + The target image resolution must be divisible by 32; otherwise, an error is raised.
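A tiny helper (ours, not part of the repository) can snap a requested size to the nearest valid dimension before calling the model:

```python
def snap_to_multiple_of_32(size: int) -> int:
    """Round a dimension to the nearest multiple of 32 (minimum 32)."""
    return max(32, round(size / 32) * 32)

height, width = snap_to_multiple_of_32(1021), snap_to_multiple_of_32(768)
# height == 1024, width == 768
```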
| 168 | + Because inference optimizations for this architecture are currently limited, the runtime cost is still relatively high. You can set `enable_model_cpu_offload=True` to run it with `~23GB` of GPU memory, at the cost of slower inference.
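If the model loads as a standard diffusers pipeline, the equivalent offload call would look like the sketch below; the repo id and pipeline class are assumptions, so check the GLM-Image repository for the exact entry point:

```python
import torch
from diffusers import DiffusionPipeline

# Assumption: the model is published as a diffusers-compatible pipeline
# under zai-org/GLM-Image; the actual class and repo id may differ.
pipe = DiffusionPipeline.from_pretrained(
    "zai-org/GLM-Image", torch_dtype=torch.bfloat16
)

# Standard diffusers idiom corresponding to enable_model_cpu_offload=True:
# submodules are moved to the GPU one at a time, which per the note above
# keeps peak usage around 23GB at the cost of slower inference.
pipe.enable_model_cpu_offload()
```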
| 169 | + Support for vLLM-Omni and SGLang (with AR speedup) is currently being integrated; stay tuned. For details on inference cost, see our GitHub repository.
| 170 |
| 171 | ## Model Performance