AVIIAX

achiru committed on Nov 22, 2023

Commit

2812f25

0 Parent(s):

Duplicate from monster-labs/control_v1p_sd15_qrcode_monster

Co-authored-by: Louis Mouhat <achiru@users.noreply.huggingface.co>

Files changed (16)

.gitattributes +38 -0
README.md +58 -0
config.json +42 -0
control_v1p_sd15_qrcode_monster.ckpt +3 -0
control_v1p_sd15_qrcode_monster.safetensors +3 -0
control_v1p_sd15_qrcode_monster.yaml +80 -0
diffusion_pytorch_model.bin +3 -0
diffusion_pytorch_model.safetensors +3 -0
images/architecture.png +3 -0
images/monster.png +3 -0
images/skulls.png +3 -0
images/tree.png +3 -0
v2/config.json +42 -0
v2/control_v1p_sd15_qrcode_monster_v2.safetensors +3 -0
v2/control_v1p_sd15_qrcode_monster_v2.yaml +80 -0
v2/diffusion_pytorch_model.safetensors +3 -0

.gitattributes ADDED

	@@ -0,0 +1,38 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+diffusion_pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
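The rules above route matching files through Git LFS. As a rough illustration, the following sketch checks which paths from this commit would be picked up. It copies only four of the rules, and it approximates Git's matching with `fnmatchcase` on the basename, which is an assumption that holds only for slash-free patterns such as `*.bin`; the helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few rules copied from the .gitattributes hunk above (not the full list).
GITATTRIBUTES = """\
*.bin filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the patterns whose attribute list contains filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: slash-free patterns are matched against the basename."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, p) for p in patterns if "/" not in p)


patterns = lfs_patterns(GITATTRIBUTES)
for path in ("v2/diffusion_pytorch_model.safetensors", "images/tree.png", "config.json"):
    print(path, is_lfs_tracked(path, patterns))
```

Patterns that contain a slash, such as `saved_model/**/*`, follow different matching rules in Git and are deliberately skipped by this sketch.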

README.md ADDED

	@@ -0,0 +1,58 @@

+---
+tags:
+- stable-diffusion
+- controlnet
+- qrcode
+license: openrail++
+language:
+- en
+---
+# Controlnet QR Code Monster v2 For SD-1.5
+![QR code in shape of a blue monster, reading "https://qrcode.monster"](images/monster.png)
+## Model Description
+This model is made to generate creative QR codes that still scan.
+Keep in mind that not all generated codes might be readable, but you can try different parameters and prompts to get the desired results.
+**NEW VERSION**
+Introducing the upgraded version of our model - Controlnet QR code Monster v2.
+V2 is a huge upgrade over v1, for scannability AND creativity.
+QR codes can now blend seamlessly into the image when you use a gray background (#808080).
+As with the former version, the readability of generated codes may vary; adjusting parameters and prompts can yield better results.
+You can find it in the `v2/` subfolder.
+## How to Use
+- **Condition**: QR codes are passed as condition images with a module size of 16px. Use a higher error correction level to make it easier to read (sometimes a lower level can be easier to read if smaller in size). Use a gray background for the rest of the image to make the code integrate better.
+- **Prompts**: Use a prompt to guide the QR code generation. The output depends heavily on the prompt. Some prompts are readily accepted by the QR code process; others require careful tweaking to get good results.
+- **Controlnet guidance scale**: Set the controlnet guidance scale value:
+   - High values: The generated QR code will be more readable.
+   - Low values: The generated QR code will be more creative.
+### Tips
+- For an optimally readable output, try generating multiple QR codes with similar parameters, then choose the best ones.
+- Use the Image-to-Image feature to improve the readability of a generated QR code:
+  - Decrease the denoising strength to retain more of the original image.
+  - Increase the controlnet guidance scale value for better readability.
+  A typical workflow for "saving" a code would be:
+  Max out the guidance scale and minimize the denoising strength, then bump the strength until the code scans.
+## Example Outputs
+Here are some examples of creative, yet scannable QR codes produced by our model:
+![City ruins with a building facade in shape of a QR code, reading "https://qrcode.monster"](images/architecture.png)
+![QR code in shape of a tree, reading "https://qrcode.monster"](images/tree.png)
+![A gothic sculpture in shape of a QR code, reading "https://qrcode.monster"](images/skulls.png)
+Feel free to experiment with prompts, parameters, and the Image-to-Image feature to achieve the desired QR code output. Good luck and have fun!
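The README's "Condition" guidance (16px modules, gray `#808080` for the rest of the image) can be sketched as follows. This is a minimal illustration rather than the authors' tooling: it assumes the module matrix comes from a QR encoder of your choice, uses a toy 2x2 matrix that is not a valid QR code, omits the quiet zone, and uses the hypothetical helpers `render_condition` and `to_pgm`.

```python
GRAY = 128       # 0x80, i.e. the #808080 background suggested in the README
MODULE_PX = 16   # module size stated in the README


def render_condition(modules, canvas_px, module_px=MODULE_PX, background=GRAY):
    """Render a square module matrix (True = dark) centered on a gray canvas.

    Returns a list of rows of 8-bit grayscale values.
    """
    code_px = len(modules) * module_px
    if code_px > canvas_px:
        raise ValueError("code does not fit on the canvas")
    offset = (canvas_px - code_px) // 2
    pixels = [[background] * canvas_px for _ in range(canvas_px)]
    for r, row in enumerate(modules):
        for c, dark in enumerate(row):
            value = 0 if dark else 255
            x0 = offset + c * module_px
            for y in range(offset + r * module_px, offset + (r + 1) * module_px):
                pixels[y][x0:x0 + module_px] = [value] * module_px
    return pixels


def to_pgm(pixels):
    """Serialize the grayscale rows as a binary PGM (P5) image."""
    height, width = len(pixels), len(pixels[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(v for row in pixels for v in row)


# Toy matrix for illustration only; a real version 1 QR code has 21x21 modules.
toy = [[True, False],
       [False, True]]
image = render_condition(toy, canvas_px=64)
data = to_pgm(image)
```

A 21x21 code at 16px per module occupies 336px, so it fits a 512px or 768px canvas with gray margin left over for the model to fill.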

config.json ADDED

	@@ -0,0 +1,42 @@

+{
+  "_class_name": "ControlNetModel",
+  "_diffusers_version": "0.17.0.dev0",
+  "act_fn": "silu",
+  "attention_head_dim": 8,
+  "block_out_channels": [
+    320,
+    640,
+    1280,
+    1280
+  ],
+  "class_embed_type": null,
+  "conditioning_embedding_out_channels": [
+    16,
+    32,
+    96,
+    256
+  ],
+  "controlnet_conditioning_channel_order": "rgb",
+  "cross_attention_dim": 768,
+  "down_block_types": [
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "DownBlock2D"
+  ],
+  "downsample_padding": 1,
+  "flip_sin_to_cos": true,
+  "freq_shift": 0,
+  "global_pool_conditions": false,
+  "in_channels": 4,
+  "layers_per_block": 2,
+  "mid_block_scale_factor": 1,
+  "norm_eps": 1e-05,
+  "norm_num_groups": 32,
+  "num_class_embeds": null,
+  "only_cross_attention": false,
+  "projection_class_embeddings_input_dim": null,
+  "resnet_time_scale_shift": "default",
+  "upcast_attention": false,
+  "use_linear_projection": false
+}

control_v1p_sd15_qrcode_monster.ckpt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1b5d69da6d00efd8eb8d1f4ba56152ae1b420d6fd55c65813bdb5773c487d4dd
+size 1445200597
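The three lines above are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. A small sketch of reading such a pointer follows; the function name `parse_lfs_pointer` is hypothetical, and the parser assumes the simple `key value` layout shown in this diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate leading diff markers
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer text copied from the hunk above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1b5d69da6d00efd8eb8d1f4ba56152ae1b420d6fd55c65813bdb5773c487d4dd
size 1445200597
"""

info = parse_lfs_pointer(POINTER)
print(info["algo"], info["size"], f"{info['size'] / 1e9:.2f} GB")
```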

control_v1p_sd15_qrcode_monster.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c7f43f70e266153d12f5e1bb1c9e7be3f4513cf0eef0432661b1331bfe11cadf
+size 722596344

control_v1p_sd15_qrcode_monster.yaml ADDED

	@@ -0,0 +1,80 @@

+model:
+  target: cldm.cldm.ControlLDM
+  params:
+    linear_start: 0.00085
+    linear_end: 0.0120
+    num_timesteps_cond: 1
+    log_every_t: 200
+    timesteps: 1000
+    first_stage_key: "jpg"
+    cond_stage_key: "txt"
+    control_key: "hint"
+    image_size: 64
+    channels: 4
+    cond_stage_trainable: false
+    conditioning_key: crossattn
+    monitor: val/loss_simple_ema
+    scale_factor: 0.18215
+    use_ema: False
+    only_mid_control: False
+    control_stage_config:
+      target: cldm.cldm.ControlNet
+      params:
+        image_size: 32 # unused
+        in_channels: 4
+        hint_channels: 3
+        model_channels: 320
+        attention_resolutions: [ 4, 2, 1 ]
+        num_res_blocks: 2
+        channel_mult: [ 1, 2, 4, 4 ]
+        num_heads: 8
+        use_spatial_transformer: True
+        transformer_depth: 1
+        context_dim: 768
+        use_checkpoint: True
+        legacy: False
+    unet_config:
+      target: cldm.cldm.ControlledUnetModel
+      params:
+        image_size: 32 # unused
+        in_channels: 4
+        out_channels: 4
+        model_channels: 320
+        attention_resolutions: [ 4, 2, 1 ]
+        num_res_blocks: 2
+        channel_mult: [ 1, 2, 4, 4 ]
+        num_heads: 8
+        use_spatial_transformer: True
+        transformer_depth: 1
+        context_dim: 768
+        use_checkpoint: True
+        legacy: False
+    first_stage_config:
+      target: ldm.models.autoencoder.AutoencoderKL
+      params:
+        embed_dim: 4
+        monitor: val/rec_loss
+        ddconfig:
+          double_z: true
+          z_channels: 4
+          resolution: 256
+          in_channels: 3
+          out_ch: 3
+          ch: 128
+          ch_mult:
+          - 1
+          - 2
+          - 4
+          - 4
+          num_res_blocks: 2
+          attn_resolutions: []
+          dropout: 0.0
+        lossconfig:
+          target: torch.nn.Identity
+    cond_stage_config:
+      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
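The `control_stage_config` values in this YAML line up with the diffusers-style `config.json` in the same commit. The sketch below shows one way to derive the two fields, assuming the usual convention that each level's width is `model_channels * channel_mult[i]` and that a level gets cross-attention when its downsampling factor appears in `attention_resolutions`. The helper names are hypothetical, and this is not presented as the conversion code that produced the files.

```python
# Values copied from control_stage_config in the YAML above.
MODEL_CHANNELS = 320
CHANNEL_MULT = [1, 2, 4, 4]
ATTENTION_RESOLUTIONS = [4, 2, 1]


def block_out_channels(model_channels, channel_mult):
    """Channel width of each resolution level."""
    return [model_channels * m for m in channel_mult]


def down_block_types(channel_mult, attention_resolutions):
    """Pick a block type per level from its downsampling factor (1, 2, 4, 8, ...)."""
    types = []
    factor = 1
    for _ in channel_mult:
        has_attention = factor in attention_resolutions
        types.append("CrossAttnDownBlock2D" if has_attention else "DownBlock2D")
        factor *= 2
    return types


print(block_out_channels(MODEL_CHANNELS, CHANNEL_MULT))
print(down_block_types(CHANNEL_MULT, ATTENTION_RESOLUTIONS))
```

Under these assumptions the results agree with `block_out_channels` and `down_block_types` in `config.json`, and `context_dim: 768` corresponds to `cross_attention_dim: 768`.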

diffusion_pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:29bc1cbd7de2e4bee030184321724bd0c6a1724dd815eb8e593379c39badfdf2
+size 1445259705

diffusion_pytorch_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c10457d104cf6adc318fee3a981d3a1a2e60796259e54420a0f933a68dbdeda
+size 1445157120

images/architecture.png ADDED

Git LFS Details

SHA256: 0ee8143ec1d4b4bf1517d737696e9eb387ae1a10d280209be1ae7dff48288b65
Pointer size: 132 Bytes
Size of remote file: 1.1 MB

images/monster.png ADDED

Git LFS Details

SHA256: 2ec088ca0ed6c3da320b25b19731ff621cbc4d05660d0eeb92e365cb552df235
Pointer size: 131 Bytes
Size of remote file: 817 kB

images/skulls.png ADDED

Git LFS Details

SHA256: 4e3d6a86fa701dd0a2bb0a58d81963bed6adc3ffad3ec787392f972ee5abb1ef
Pointer size: 132 Bytes
Size of remote file: 1.05 MB

images/tree.png ADDED

Git LFS Details

SHA256: 3d99a8f79d50b410d8dae98d0eefd891d71a0c7c2303d90d85566cc5811c1a15
Pointer size: 132 Bytes
Size of remote file: 1.31 MB

v2/config.json ADDED

	@@ -0,0 +1,42 @@

+{
+  "_class_name": "ControlNetModel",
+  "_diffusers_version": "0.17.0.dev0",
+  "act_fn": "silu",
+  "attention_head_dim": 8,
+  "block_out_channels": [
+    320,
+    640,
+    1280,
+    1280
+  ],
+  "class_embed_type": null,
+  "conditioning_embedding_out_channels": [
+    16,
+    32,
+    96,
+    256
+  ],
+  "controlnet_conditioning_channel_order": "rgb",
+  "cross_attention_dim": 768,
+  "down_block_types": [
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "DownBlock2D"
+  ],
+  "downsample_padding": 1,
+  "flip_sin_to_cos": true,
+  "freq_shift": 0,
+  "global_pool_conditions": false,
+  "in_channels": 4,
+  "layers_per_block": 2,
+  "mid_block_scale_factor": 1,
+  "norm_eps": 1e-05,
+  "norm_num_groups": 32,
+  "num_class_embeds": null,
+  "only_cross_attention": false,
+  "projection_class_embeddings_input_dim": null,
+  "resnet_time_scale_shift": "default",
+  "upcast_attention": false,
+  "use_linear_projection": false
+}

v2/control_v1p_sd15_qrcode_monster_v2.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fc985da5850a03033c9e28032532f406ae04bd127178ae5bc6d3ec0502b25253
+size 722596344

v2/control_v1p_sd15_qrcode_monster_v2.yaml ADDED

	@@ -0,0 +1,80 @@

+model:
+  target: cldm.cldm.ControlLDM
+  params:
+    linear_start: 0.00085
+    linear_end: 0.0120
+    num_timesteps_cond: 1
+    log_every_t: 200
+    timesteps: 1000
+    first_stage_key: "jpg"
+    cond_stage_key: "txt"
+    control_key: "hint"
+    image_size: 64
+    channels: 4
+    cond_stage_trainable: false
+    conditioning_key: crossattn
+    monitor: val/loss_simple_ema
+    scale_factor: 0.18215
+    use_ema: False
+    only_mid_control: False
+    control_stage_config:
+      target: cldm.cldm.ControlNet
+      params:
+        image_size: 32 # unused
+        in_channels: 4
+        hint_channels: 3
+        model_channels: 320
+        attention_resolutions: [ 4, 2, 1 ]
+        num_res_blocks: 2
+        channel_mult: [ 1, 2, 4, 4 ]
+        num_heads: 8
+        use_spatial_transformer: True
+        transformer_depth: 1
+        context_dim: 768
+        use_checkpoint: True
+        legacy: False
+    unet_config:
+      target: cldm.cldm.ControlledUnetModel
+      params:
+        image_size: 32 # unused
+        in_channels: 4
+        out_channels: 4
+        model_channels: 320
+        attention_resolutions: [ 4, 2, 1 ]
+        num_res_blocks: 2
+        channel_mult: [ 1, 2, 4, 4 ]
+        num_heads: 8
+        use_spatial_transformer: True
+        transformer_depth: 1
+        context_dim: 768
+        use_checkpoint: True
+        legacy: False
+    first_stage_config:
+      target: ldm.models.autoencoder.AutoencoderKL
+      params:
+        embed_dim: 4
+        monitor: val/rec_loss
+        ddconfig:
+          double_z: true
+          z_channels: 4
+          resolution: 256
+          in_channels: 3
+          out_ch: 3
+          ch: 128
+          ch_mult:
+          - 1
+          - 2
+          - 4
+          - 4
+          num_res_blocks: 2
+          attn_resolutions: []
+          dropout: 0.0
+        lossconfig:
+          target: torch.nn.Identity
+    cond_stage_config:
+      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder

v2/diffusion_pytorch_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a95dc35dfff997bd9c751a97db4959db1235f5776ef5e453d468c602c7998794
+size 1445157120
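
After downloading a weight file, the `oid` and `size` from its pointer can be used to confirm that the local copy is intact. The following is a minimal sketch using only the standard library; `matches_pointer` is a hypothetical helper, and the local path in the usage note is an assumption about where you saved the file.

```python
import hashlib
from pathlib import Path


def matches_pointer(path, expected_sha256, expected_size, chunk_size=1 << 20):
    """Return True if the file's size and SHA-256 match the LFS pointer values."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False  # cheap check first, avoids hashing a truncated download
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

For the v2 diffusers weights above, a call might look like `matches_pointer("v2/diffusion_pytorch_model.safetensors", "a95dc35dfff997bd9c751a97db4959db1235f5776ef5e453d468c602c7998794", 1445157120)`.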