Upload folder using huggingface_hub

Files changed:
- README.md +13 -14
- README_from_modelscope.md +3 -0
- qwen_image_layered_control_bf16.safetensors +3 -0
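Each changed file below is shown as unified-diff hunks. A header such as `@@ -5,23 +5,22 @@` encodes the start line and line count on the old (`-`) and new (`+`) sides. As a minimal sketch (a hypothetical helper, not part of this repository), such a header can be parsed like this:

```python
import re

def parse_hunk_header(header: str) -> dict:
    """Parse a unified-diff hunk header like '@@ -5,23 +5,22 @@'.

    Returns start line and line count for the old and new sides;
    a missing count defaults to 1, per the unified-diff convention.
    """
    m = re.match(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@", header)
    if not m:
        raise ValueError(f"not a hunk header: {header!r}")
    old_start, old_count, new_start, new_count = m.groups()
    return {
        "old_start": int(old_start),
        "old_count": int(old_count) if old_count else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count else 1,
    }
```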
README.md
CHANGED
@@ -5,23 +5,22 @@ license: apache-2.0
 
 ## Model Introduction
 
-This model is trained based on [Qwen/Qwen-Image-Layered](https://modelscope.cn/models/Qwen/Qwen-Image-Layered) using the dataset [artplus/PrismLayersPro](https://modelscope.cn/datasets/artplus/PrismLayersPro), enabling text-controlled extraction of segmented layers.
+This model is trained based on the model [Qwen/Qwen-Image-Layered](https://modelscope.cn/models/Qwen/Qwen-Image-Layered) using the dataset [artplus/PrismLayersPro](https://modelscope.cn/datasets/artplus/PrismLayersPro), enabling text-controlled extraction of segmented layers.
 
-For more details about training strategies and implementation, feel free to check our [technical blog](https://huggingface.co/blog/kelseye/qwen-image-layered-control).
+For more details about training strategies and implementation, feel free to check our [technical blog](https://modelscope.cn/learn/4938).
 
 ## Usage Tips
 
+* The model architecture has been changed from multi-image output to single-image output, producing only the layer relevant to the provided text description.
+* The model was trained exclusively on English text, but retains Chinese language understanding capabilities inherited from the base model.
 * The native training resolution is 1024x1024; however, inference at other resolutions is supported.
+* The model struggles to separate multiple entities that are heavily occluded or overlapping, such as the cartoon skeleton head and hat in the examples.
+* The model excels at decomposing poster-like graphics but performs poorly on photographic images, especially those involving complex lighting and shadows.
+* The model supports negative prompts: users can specify content they wish to exclude via negative prompt descriptions.
 
 ## Demo Examples
 
+**Some images contain white text on light backgrounds. ModelScope users should click the "☀︎" icon in the top-right corner to switch to dark mode for better visibility.**
 
 ### Example 1
 
@@ -64,8 +63,8 @@
 
 |Prompt|Output Image|Prompt|Output Image|
 |-|-|-|-|
+|Blue sky, white clouds, a garden with colorful flowers||Colorful, intricate floral wreath||
+|Girl, wreath, kitten||Girl, kitten||
 
 </div>
 
@@ -87,8 +86,8 @@
 
 |Prompt|Output Image|Prompt|Output Image|
 |-|-|-|-|
+|A clear blue sky and a turbulent sea||Text "The Life I Long For"||
+|A seagull||Text "Life"||
 
 </div>
 
@@ -104,7 +103,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```
 
-Model
+Model inference:
 
 ```python
 from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig
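The usage tips above state a native training resolution of 1024x1024 with other resolutions supported at inference. Latent-diffusion pipelines typically require dimensions divisible by a fixed factor; the sketch below is a hypothetical helper for snapping a requested size to a valid one (the divisor of 16 is an assumption typical of such models, not a value taken from this README):

```python
def snap_resolution(width: int, height: int, multiple: int = 16) -> tuple:
    """Round each dimension to the nearest positive multiple of `multiple`.

    The default divisor is an assumption common to latent-diffusion
    models, not a value documented for this checkpoint.
    """
    def snap(v: int) -> int:
        return max(multiple, int(v / multiple + 0.5) * multiple)
    return snap(width), snap(height)
```

For example, a requested 1000x700 canvas would be snapped to 1008x704 under these assumptions, while the native 1024x1024 passes through unchanged.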
README_from_modelscope.md
CHANGED
@@ -14,6 +14,9 @@ base_model_relation: finetune
 
 This model is trained based on the model [Qwen/Qwen-Image-Layered](https://modelscope.cn/models/Qwen/Qwen-Image-Layered) using the dataset [artplus/PrismLayersPro](https://modelscope.cn/datasets/artplus/PrismLayersPro); the layers to be extracted can be controlled via text.
 
+
+For more details about training strategies and implementation, feel free to check our [technical blog](https://modelscope.cn/learn/4938).
+
 ## Usage Tips
 
 * The model architecture was changed from multi-image output to single-image output; only the layer relevant to the text description is produced
qwen_image_layered_control_bf16.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:63b1966f0423bdc94d87273b8958de91e0a8f642c635f9113632d09cae3aa4ad
+size 40861043888