yumpyy committed on
feat: add desi-max model weights and demo images
- .gitattributes +7 -0
- README.md +67 -48
- configuration.json +1 -0
- demo/bharat-ai.png +3 -0
- demo/bournvita.png +3 -0
- demo/gen-1.jpeg +3 -0
- demo/gen-6-1.jpeg +3 -0
- demo/google-2.png +3 -0
- demo/pulse.png +3 -0
- desi-max_10.safetensors +3 -0
- desi-max_5.safetensors +3 -0
.gitattributes
CHANGED
@@ -45,6 +45,13 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+desi-max_10.safetensors filter=lfs diff=lfs merge=lfs -text
+demo/gen-6-1.jpeg filter=lfs diff=lfs merge=lfs -text
+demo/google-2.png filter=lfs diff=lfs merge=lfs -text
+demo/pulse.png filter=lfs diff=lfs merge=lfs -text
+demo/bharat-ai.png filter=lfs diff=lfs merge=lfs -text
+demo/bournvita.png filter=lfs diff=lfs merge=lfs -text
+demo/gen-1.jpeg filter=lfs diff=lfs merge=lfs -text
 *.png filter=lfs diff=lfs merge=lfs -text
 *.jpeg filter=lfs diff=lfs merge=lfs -text
 *.jpg filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -1,55 +1,74 @@
 ---
-base_model: Qwen/Qwen-Image-2512
-license:
+base_model: Qwen/Qwen-Image-2512
+license: apache-2.0
 tags:
--
-- text-to-image
-
--
-
-
-vision_foundation: QWEN_IMAGE_20_B
-
-#model-type:
-##such as gpt、phi、llama、chatglm、baichuan, etc.
-#- gpt
-
-#domain:
-##such as nlp、cv、audio、multi-modal, etc.
-#- nlp
-
-#language:
-##language code list https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
-#- cn
-
-#metrics:
-##such as CIDEr、Blue、ROUGE, etc.
-#- CIDEr
-
-#tags:
-##various custom tags, including pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
-#- pretrained
-
-#tools:
-##such as vllm、fastchat、llamacpp、AdaSeq, etc.
-#- vllm
+- lora
+- text-to-image
+- fine-tuned
+- style-transfer
+pipeline_tag: text-to-image
+trigger_word: desi-max
 ---
-### You are viewing the default Readme template as no detailed model-card was provided by the model’s contributors. You can access the model files in the "Files and versions" tab.
-#### Model files may be downloaded with ModelScope SDK or through git clone directly.
 
-
-
-
-
-
-
-
-
-
-
-
+# Desi Maximalism LoRA
+
+<p align="center">
+<img src="demo/gen-1.jpeg" width="30%" style="border-radius:6px; margin:4px"/>
+<img src="demo/gen-6-1.jpeg" width="30%" style="border-radius:6px; margin:4px"/>
+<img src="demo/bharat-ai.png" width="30%" style="border-radius:6px; margin:4px"/>
+</p>
+<p align="center">
+<img src="demo/bournvita.png" width="30%" style="border-radius:6px; margin:4px"/>
+<img src="demo/google-2.png" width="30%" style="border-radius:6px; margin:4px"/>
+<img src="demo/pulse.png" width="30%" style="border-radius:6px; margin:4px"/>
+</p>
+
+---
+
+## Model
+
+| | |
+|---|---|
+| **Base Model** | `Qwen/Qwen-Image-2512` |
+| **Vision Foundation** | Qwen2.5-VL · 20B parameters |
+| **Fine-tuning Method** | LoRA |
+| **Task** | Text-to-image · style transfer |
+| **Trigger Word** | `desi-max` |
+| **License** | Apache 2.0 |
+
+---
+
+## Dataset
+
+78 handpicked images of vintage South Asian commercial print — matchbox labels, product packaging, film posters, and magazine ads (c. 1940–1985). Each image was manually selected to represent a distinct visual sub-pattern, keeping the dataset tight and avoiding style collapse.
+
+---
+
+## Training Target
+
+The LoRA is optimised to reproduce:
+
+- Bold flat colour blocking and high-contrast palettes
+- Decorative borders, concentric rules, cartouche framing
+- Halftone and offset print grain/texture
+- Dense multi-scale typographic hierarchy
+- Hand-painted illustration shading and exaggerated perspective
+
+---
+
+## Usage
+
+Prepend and append `desi-max` to your prompt.
+
 ```
-
+desi-max, vintage Indian matchbox label, GAJRAJ AUTO in large red lettering,
+blue starburst, illustrated autorickshaw in green and pink, yellow background,
+bold halftone print texture, mid-century South Asian commercial design, desi-max
 ```
 
-
+---
+
+## Limitations
+
+- Devanagari / Tamil script accuracy is bounded by the base model's multilingual capability
+- Optimised for flat illustrated aesthetics — not photorealism
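Note on the README's Usage section: it asks for the trigger word `desi-max` at both ends of the prompt. A minimal sketch of a helper that enforces this is shown here; `wrap_prompt` and `TRIGGER` are hypothetical names chosen for illustration and are not part of this commit.

```python
TRIGGER = "desi-max"  # trigger word stated in the README front matter


def wrap_prompt(description: str, trigger: str = TRIGGER) -> str:
    """Return the description with the trigger word at both ends.

    Hypothetical helper: it only formats text and does not load the LoRA.
    """
    core = description.strip().strip(",").strip()
    # Drop a trigger the caller already supplied so it is not doubled.
    if core.startswith(trigger):
        core = core[len(trigger):].lstrip(", ")
    if core.endswith(trigger):
        core = core[: -len(trigger)].rstrip(", ")
    return f"{trigger}, {core}, {trigger}"


prompt = wrap_prompt("vintage Indian matchbox label, yellow background")
print(prompt)
```

The helper is idempotent, so passing an already wrapped prompt returns it unchanged.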
configuration.json
ADDED
@@ -0,0 +1 @@
+{"aigc_model":true,"framework":"pytorch","model_file_location":"desi-max_10.safetensors","task":"text-to-image-synthesis"}
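Note on `configuration.json`: the file names `desi-max_10.safetensors` as the model file, while the commit also adds `desi-max_5.safetensors`. As a rough sketch, a consumer could read the configured weights file like this. The JSON string is copied from the diff above; how any particular loader interprets these keys is an assumption, not something this commit documents.

```python
import json

# Contents of configuration.json as added in this commit.
CONFIG_TEXT = (
    '{"aigc_model":true,"framework":"pytorch",'
    '"model_file_location":"desi-max_10.safetensors",'
    '"task":"text-to-image-synthesis"}'
)

config = json.loads(CONFIG_TEXT)

# The key names come from the file itself; treating this value as the
# default checkpoint is an assumption made for illustration.
weights_file = config["model_file_location"]
print(f"{config['framework']} / {config['task']}: {weights_file}")
```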
demo/bharat-ai.png
ADDED
demo/bournvita.png
ADDED
demo/gen-1.jpeg
ADDED
demo/gen-6-1.jpeg
ADDED
demo/google-2.png
ADDED
demo/pulse.png
ADDED
desi-max_10.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82d7e08c7f12e1161155e90a84213dd514c1a1fa0c95ff4d4a8183eae6571ef7
+size 236117024
desi-max_5.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef49c32ff4b9f82fb13269c1fab702b51de0881cf90689ccd16c1b24a1f507ff
+size 236117024
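Note on the two `.safetensors` entries: what the diff shows are Git LFS pointer files (spec version, SHA-256 object ID, size in bytes), not the weights themselves. The sketch below parses a pointer and checks a downloaded file against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Hash in 1 MiB chunks so large checkpoints are not read into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text for desi-max_10.safetensors, copied from the diff above.
POINTER_10 = """version https://git-lfs.github.com/spec/v1
oid sha256:82d7e08c7f12e1161155e90a84213dd514c1a1fa0c95ff4d4a8183eae6571ef7
size 236117024
"""

pointer = parse_lfs_pointer(POINTER_10)
print(pointer["algo"], pointer["size"])

# Usage, assuming the checkpoint was downloaded to the working directory:
# matches_pointer("desi-max_10.safetensors", pointer)
```

Both checkpoints report the same size (236117024 bytes) but different object IDs, so the size check alone cannot tell them apart; the digest comparison can.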