hgh #1
by kyle991212 - opened
- .gitattributes +0 -4
- FramePack-dance-lora-d8.safetensors +0 -3
- README.md +0 -150
- flux-hasui-lora-d4-sigmoid-raw-gs1.0.safetensors +0 -3
- fp-1f-chibi-1024.safetensors +0 -3
- fp-1f-kisekae-1024-v4-2-PfPHEMA.safetensors +0 -3
- fp-1f-kisekae-1024-v4-2.safetensors +0 -3
- joyo-kanji-lora-bw-v1-fp16.safetensors +0 -3
- omi-sample-lora-want2v1-3b.safetensors +0 -3
- penguin.png +0 -3
- qwenimage-blob_emoji-4-s020-6.safetensors +0 -3
- sdxl-negprompt8-v1.safetensors +0 -3
- sdxl-negprompt8-v1m.safetensors +0 -3
- sdxl_pixel_32_v1_ema_300000.safetensors +0 -3
- shrine.png +0 -3
- stable-cascade-c-lora-hasui-v01.safetensors +0 -3
- stable-cascade-c-lora-hasui-v02.safetensors +0 -3
- wd15b3-bad-v1.safetensors +0 -3
- wd15b3-neg-v1.safetensors +0 -3
- yellow_blob_1.png +0 -3
- yellow_blob_2.png +0 -3
.gitattributes
CHANGED
@@ -32,7 +32,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
-penguin.png filter=lfs diff=lfs merge=lfs -text
-shrine.png filter=lfs diff=lfs merge=lfs -text
-yellow_blob_1.png filter=lfs diff=lfs merge=lfs -text
-yellow_blob_2.png filter=lfs diff=lfs merge=lfs -text
FramePack-dance-lora-d8.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:1f00ad5c4be420de0f5000837a79024e01a6052cda55b5ebe092412e2f6ed615
-size 68985112
README.md
DELETED
@@ -1,150 +0,0 @@

## qwenimage-blob_emoji-4-s020-6.safetensors

Blob emoji LoRA.

The training captions look like `Yellow blob emoji with smiling face with smiling eyes. The background is gray.`, so `blob emoji` or `blob emoji with face ...` act as trigger words.

- Blob emoji with face holds a sign that says "Blob Emoji" in front of a Japanese shrine. --w 1024 --h 1024 --s 50 --d 1001

![image/png]()

- Blob emoji face drives a red sports car along a curved road on a cliff overlooking the sea. The sea is dotted with whitecaps. The sky is blue, and cumulonimbus clouds float on the horizon. --w 1664 --h 928 --s 50 --d 12345678

![image/png]()

### Dataset Creation Procedure

The dataset was created following these steps:

- The SVG files from [C1710/blobmoji](https://github.com/C1710/blobmoji) (licensed under ASL 2.0) were used; specifically, 118 different yellow blob emojis were selected from the SVG files.
- `cairosvg` was used to convert these SVGs into 512x512 pixel transparent PNGs.
- A script was then used to pad the images to 640x640 pixels and to generate four versions of each image with different background colors (white, light gray, gray, and black), for a total of 472 images.
- The captions were generated from the official Unicode names of the emojis, adding the prefix `Yellow blob emoji with ` and the suffix `. The background is <color>.` to each name.
- For example: `Yellow blob emoji with smiling face with smiling eyes. The background is gray.`
- Note: for some emojis (e.g., devil, zombie), the word `Yellow` was omitted from the prefix.
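The caption-generation step above is simple enough to sketch. The emoji names and the helper below are hypothetical stand-ins; the actual script is not included in the repository:

```python
# Hypothetical sketch of the caption-generation step described above.
# The real dataset uses the 118 official Unicode emoji names.
emoji_names = ["smiling face with smiling eyes", "grinning face"]
colors = ["white", "light gray", "gray", "black"]

def make_caption(name: str, color: str, yellow: bool = True) -> str:
    # Some emojis (devil, zombie, ...) drop the "Yellow" prefix.
    prefix = "Yellow blob emoji with " if yellow else "Blob emoji with "
    return f"{prefix}{name}. The background is {color}."

captions = [make_caption(n, c) for n in emoji_names for c in colors]

# 118 emojis x 4 background colors gives the 472 images mentioned above.
total_images = 118 * len(colors)
```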
|
| 23 |
-
|
| 24 |
-
### Dataset Definition
|
| 25 |
-
|
| 26 |
-
```
|
| 27 |
-
# general configurations
|
| 28 |
-
[general]
|
| 29 |
-
resolution = [640, 640]
|
| 30 |
-
batch_size = 16
|
| 31 |
-
enable_bucket = true
|
| 32 |
-
bucket_no_upscale = false
|
| 33 |
-
caption_extension = ".txt"
|
| 34 |
-
|
| 35 |
-
[[datasets]]
|
| 36 |
-
image_directory = "path/to/images_and_captions_dir"
|
| 37 |
-
cache_directory = "path/to/cache_dir"
|
| 38 |
-
```
|
| 39 |
-
|
| 40 |
-
### Training Command
|
| 41 |
-
|
| 42 |
-
```
|
| 43 |
-
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 --rdzv_backend=c10d \
|
| 44 |
-
src/musubi_tuner/qwen_image_train_network.py \
|
| 45 |
-
--dit path/to/dit.safetensors --vae path/to/vae.safetensors \
|
| 46 |
-
--text_encoder path/to/vlm.safetensors \
|
| 47 |
-
--dataset_config path/to/blob_emoji_v1_640_bs16.toml \
|
| 48 |
-
--output_dir path/to/output_dir \
|
| 49 |
-
--learning_rate 2e-4 \
|
| 50 |
-
--timestep_sampling shift --weighting_scheme none --discrete_flow_shift 2.0 \
|
| 51 |
-
--max_train_epochs 16 --mixed_precision bf16 --seed 42 --gradient_checkpointing \
|
| 52 |
-
--network_module=networks.lora_qwen_image \
|
| 53 |
-
--network_dim=4 --network_args loraplus_lr_ratio=4 \
|
| 54 |
-
--save_every_n_epochs=1 --max_data_loader_n_workers 2 \
|
| 55 |
-
--persistent_data_loader_workers \
|
| 56 |
-
--logging_dir ./logs --log_prefix qwenimage-blob4-2e4- \
|
| 57 |
-
--output_name qwenimage-blob4-2e4 \
|
| 58 |
-
--optimizer_type adamw8bit --flash_attn --split_attn \
|
| 59 |
-
--log_with tensorboard \
|
| 60 |
-
--sample_every_n_epochs 1 --sample_prompts path/to/prompts_qwen_blob_emoji.txt \
|
| 61 |
-
--fp8_base --fp8_scaled
|
| 62 |
-
```
|
| 63 |
-
|
| 64 |
-
### Training Details
|
| 65 |
-
|
| 66 |
-
- Training was conducted on a Windows machine with a multi-GPU setup (2x RTX A6000).
|
| 67 |
-
- If you are not using a Windows environment or not performing multi-GPU training, please remove the `--rdzv_backend=c10d` argument.
|
| 68 |
-
- Please note that due to the 2-GPU setup, the effective batch size is 32. To achieve the same results with limited VRAM, increase the gradient accumulation steps. However, you should be able to train successfully with a lower batch size by adjusting the learning rate.
|
| 69 |
-
- The model was trained for 6 epochs (90 steps), which took approximately 1 hour with the Power Limit set to 60%.
|
| 70 |
-
- Finally, the weights from all 6 epochs were merged using the LoRA Post-Hoc EMA script from Musubi Tuner with `sigma_rel=0.2`.
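As a rough illustration of the Post-Hoc EMA merge mentioned above, the sketch below averages per-epoch checkpoints with a power-function weighting that favors later epochs. The exponent `gamma` and the toy two-parameter "checkpoints" are assumptions for illustration only; the actual Musubi Tuner script derives its weights from `sigma_rel`:

```python
# Illustrative power-function weighted average over saved epoch checkpoints,
# in the spirit of Post-Hoc EMA. `gamma` is a made-up exponent; the real
# script computes the profile from sigma_rel (here 0.2).
def power_weights(n_checkpoints: int, gamma: float = 6.0) -> list[float]:
    raw = [(i + 1) ** gamma for i in range(n_checkpoints)]  # later epochs weigh more
    total = sum(raw)
    return [w / total for w in raw]

def merge(checkpoints: list[dict[str, float]], weights: list[float]) -> dict[str, float]:
    merged = {k: 0.0 for k in checkpoints[0]}
    for ckpt, w in zip(checkpoints, weights):
        for k, v in ckpt.items():
            merged[k] += w * v
    return merged

# Toy stand-ins for the 6 epoch checkpoints (real ones hold LoRA tensors).
ckpts = [{"lora_a": float(e), "lora_b": 10.0 - e} for e in range(6)]
w = power_weights(6)
merged = merge(ckpts, w)
```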
## fp-1f-kisekae-1024-v4-2-PfPHEMA.safetensors

Post-Hoc EMA (with power function `sigma_rel=0.2`) version of the following LoRA. The usage is the same.

## fp-1f-kisekae-1024-v4-2.safetensors

Experimental LoRA for FramePack one-frame kisekaeichi. The target index is 5. The prompt is as follows:

```
The girl stays in the same pose, but her outfit changes into a <costume description>, then she changes into another girl wearing the same outfit.
```

`costume description` is something like `school uniform`. A detailed description may improve the results, for example: "T-shirt with writing on it" or "Girl with long hair".

This model is trained at 1024x1024 resolution. Please use it at roughly the same resolution.

## fp-1f-chibi-1024.safetensors

Experimental LoRA for FramePack one-frame inference. The target index is 9. The prompt is as follows:

```
An anime character transforms: her head grows larger, her body becomes shorter and smaller, eyes become bigger and cuter. She turns into a chibi (super-deformed) version, with cartoonishly cute proportions. The transformation is quick and playful.
```

This model is trained at 1024x1024 resolution. Please use it at roughly the same resolution. If the effect is too strong, lower the multiplier (strength) to 0.8 or less.

## FramePack-dance-lora-d8.safetensors

Experimental LoRA for FramePack. This is for testing purposes and the effect is weak. Please set the prompt to something like `A woman is spinning on her tiptoes`.

## flux-hasui-lora-d4-sigmoid-raw-gs1.0.safetensors

Experimental LoRA for FLUX.1 dev.

Trained with `sd-scripts` (Aug. 11) `sd3` branch. __NOTE:__ These settings require > 26GB VRAM. Add `--fp8_base` to enable fp8 training and reduce VRAM usage.

```
accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 flux_train_network.py \
  --pretrained_model_name_or_path flux1/flux1-dev.sft --clip_l sd3/clip_l.safetensors \
  --t5xxl sd3/t5xxl_fp16.safetensors --ae flux1/ae_dev.sft \
  --cache_latents_to_disk --save_model_as safetensors --sdpa \
  --persistent_data_loader_workers --max_data_loader_n_workers 2 --seed 42 \
  --gradient_checkpointing --mixed_precision bf16 --save_precision bf16 \
  --network_module networks.lora_flux --network_dim 4 --optimizer_type adamw8bit \
  --learning_rate 1e-3 --network_train_unet_only --cache_text_encoder_outputs \
  --cache_text_encoder_outputs_to_disk --highvram --max_train_epochs 4 \
  --save_every_n_epochs 1 --dataset_config hasui_1024_bs1.toml \
  --output_dir flux/lora --output_name lora-name \
  --timestep_sampling sigmoid --model_prediction_type raw --guidance_scale 1.0
```

The .toml is below.

```toml
[general]
flip_aug = true
color_aug = false

[[datasets]]
enable_bucket = true
resolution = [1024,1024]
bucket_reso_steps = 64
max_bucket_reso = 2048
min_bucket_reso = 128
bucket_no_upscale = false
batch_size = 1
random_crop = false
shuffle_caption = false

[[datasets.subsets]]
image_dir = "path/to/train/images"
num_repeats = 1
caption_extension = ".txt"
```

## sdxl-negprompt8-v1m.safetensors

Negative embeddings for SDXL. Num vectors per token = 8.
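For readers unfamiliar with multi-vector embeddings: "num vectors per token = 8" means the single placeholder token expands into 8 learned embedding vectors at encode time. A toy sketch, in which the placeholder name, dimensions, and vector values are all hypothetical:

```python
# Toy illustration of a multi-vector textual-inversion embedding.
# Real SDXL text encoders use 768/1280-dim embeddings; 4 dims here for brevity.
NUM_VECTORS = 8
EMBED_DIM = 4

# Hypothetical learned embedding: 8 vectors behind one placeholder token.
learned = {"<negprompt8>": [[0.0] * EMBED_DIM for _ in range(NUM_VECTORS)]}

def expand_prompt(tokens: list[str]) -> list[list[float]]:
    out = []
    for tok in tokens:
        if tok in learned:
            out.extend(learned[tok])       # 1 placeholder token -> 8 vectors
        else:
            out.append([0.1] * EMBED_DIM)  # stand-in for a normal token embedding
    return out

seq = expand_prompt(["a", "<negprompt8>", "photo"])
```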
## stable-cascade-c-lora-hasui-v02.safetensors

Sample LoRA for Stable Cascade Stage C.

Feb 22, 2024 update: fixed a bug where LoRA was not applied to some modules (to_q/k/v and to_out) in Attention.

__This is an experimental model, so the format of the weights may change in the future.__

- a painting of an anthropomorphic penguin sitting in a cafe reading a book and having a coffee --w 1024 --h 1024 --d 1

![image/png]()

- a painting of a japanese shrine in winter with snowfall --w 832 --h 1152 --d 1234

![image/png]()

This model is trained with 169 captioned images. U-Net only, dim=4, conv_dim=4, alpha=1, lr=1e-3, 4 epochs, mixed precision bf16, 8-bit AdamW, batch size 8, resolution 1024x1024 with aspect ratio bucketing. VRAM usage is approximately 22 GB.
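The dim/alpha values quoted above determine how strongly a LoRA offsets each target weight: W' = W + (alpha / dim) * B @ A. A minimal pure-Python sketch with toy rank-1 matrices (with the alpha=1, dim=4 used above, the scale would be 0.25):

```python
# Sketch of applying a LoRA of rank r with scaling alpha/r to a weight matrix.
# The matrices below are toy values, not real model weights.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply_lora(W, A, B, alpha: float, rank: int):
    scale = alpha / rank
    delta = matmul(B, A)  # (out, r) @ (r, in) -> (out, in)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # base weight (2x2 identity)
A = [[1.0, 2.0]]              # down-projection, rank 1 (1x2)
B = [[1.0], [0.0]]            # up-projection (2x1)
W2 = apply_lora(W, A, B, alpha=1.0, rank=1)
```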
flux-hasui-lora-d4-sigmoid-raw-gs1.0.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:449077320fe1aebcd481e180f629bb97acf80f867e9e1c70fae5654eff72c1a8
-size 38583056
fp-1f-chibi-1024.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5a43b72ed291783c6cf613e9b8651f48a022299404abb7bccaa29b24a453bee3
-size 68984984
fp-1f-kisekae-1024-v4-2-PfPHEMA.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f8f76f612a2d44c5003b7c4a6a33436c20267b38bf9a4256d82c50be1bc98a4b
-size 68984992
fp-1f-kisekae-1024-v4-2.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:09c38d6995c815f7672b4121db54b3d75a744f93c4653f9921b4f13443854515
-size 68984992
joyo-kanji-lora-bw-v1-fp16.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c5f8747caa1c7653daf10a90b887c737c7c7547418bce9fc8cd067a75e66432c
-size 155886190
omi-sample-lora-want2v1-3b.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f6003de425f0df92b9e3685575b5092147cdf14510c5b92d91592de95784d917
-size 87594656
penguin.png
DELETED
Git LFS Details
qwenimage-blob_emoji-4-s020-6.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:754303b63980985afbf7ec54e51383ccf4a3a3f528c998b31b2f5197468d7ee5
-size 74052048
sdxl-negprompt8-v1.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:0c6f673292d047973ceab165880d15eb6800c247b749ee93c19b79384f1a27a6
-size 33280
sdxl-negprompt8-v1m.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:24350b43a034f2cd3249ba381c555a9487c192d7d6bd83b144821adb5278bdbe
-size 32920
sdxl_pixel_32_v1_ema_300000.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e2d8c6ad85dbc8b4b0390d6e1062fec98e80a8b9877d702aacd5e51b9eb09a34
-size 6557358922
shrine.png
DELETED
Git LFS Details
stable-cascade-c-lora-hasui-v01.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:53f0bb9f62a9e8287afe033c20a02219e3fed996e6f09f87ed227a6c61aa1e07
-size 19424494
stable-cascade-c-lora-hasui-v02.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c49faf55d75935b512237e737ffb006b3f249b8a63e1b7b42e0074ea925ed139
-size 27911702
wd15b3-bad-v1.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:512a7fe6bafc9280d07df227f1c408f17f7e2624c10d0d75b32c1bdf4248cb64
-size 16464
wd15b3-neg-v1.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f25be3cbdef4d90e0297d7fccdc152d2f6c4514ed6b0f14001c8dac2fff07ca6
-size 16464
yellow_blob_1.png
DELETED
Git LFS Details
yellow_blob_2.png
DELETED
Git LFS Details