Commit 480ecfd by SyFeee · verified · 1 Parent(s): 81d5231

docs: move source-attribution callout to end of page; remove specific stack-LoRA references

  1. README.md +24 -29
README.md CHANGED
@@ -19,16 +19,13 @@ language:
 
  # LTX-Video 2.3 IC-LoRA: Dual-Character (English mirror)
 
- > ⚠️ **This is an English-language mirror of [fxj1131's LTX2.3 IC-LoRA Dual-Character on ModelScope](https://www.modelscope.cn/models/fxj1131/LTX2.3-IC-LORA-Dual-Character).**
- > All credit for the model weights belongs to the original author, **麻雀 AI (Maque AI)**.
- > This mirror exists to make the model + documentation accessible to HuggingFace users who cannot easily access ModelScope, and to share field-tested usage notes from a production deployment.
- > **The `.safetensors` weights file is unmodified and byte-identical to the ModelScope upload.**
 
  ---
 
  ## Example renders
 
- All clips below are rendered by this LoRA stacked at `0.6` strength with `LTX2.3_Fantasy_Realism` (0.4) and `Motion_Stability` (0.4) - the production stack described later in this card. 1280×704, 121 frames @ 24 fps, ambient audio. Episode is a 8-shot Chinese palace drama (《玉佩定情》 + 《暗夜阴谋》) with three characters: 沈月华 (Shen Yuehua, heroine), 萧云霄 (Xiao Yunxiao, prince), 慕容静 (Murong Jing, antagonist).
 
  ### Single-character identity - Shen Yuehua walking in the garden, picks up a jade pendant
  <video controls autoplay muted loop src="https://huggingface.co/SyFeee/LTX2.3-Dual-Character-en/resolve/main/examples/E1S1_garden_walk_single_character.mp4"></video>
@@ -42,8 +39,6 @@ All clips below are rendered by this LoRA stacked at `0.6` strength with `LTX2.3
  ### Three-character composition - the LoRA's upper limit
  <video controls autoplay muted loop src="https://huggingface.co/SyFeee/LTX2.3-Dual-Character-en/resolve/main/examples/E2S4_three_character_confrontation.mp4"></video>
 
- > All four clips were generated in a single end-to-end episode run. Cross-shot continuity (Shen's robe color, Xiao's pose, lighting) is maintained via a 12-frame tail-clip from the prior shot fed as `ref_videos` - see the "Cross-shot identity drift" tip below.
-
  ---
 
  ## What this LoRA does
@@ -67,7 +62,6 @@ This is an **IC-LoRA** (in-context LoRA), so it expects reference images to be p
  | LoRA type | IC-LoRA (video-to-video conditioning) |
  | File | `LTX2.3-IC-LORA-Dual-Character.safetensors` (~313 MB) |
  | License | Apache 2.0 |
- | Original author | 麻雀 AI / fxj1131 (on ModelScope) |
  | Trigger word | None - no special token required |
 
  ---
@@ -76,17 +70,10 @@ This is an **IC-LoRA** (in-context LoRA), so it expects reference images to be p
 
  The notes below are from running this LoRA in production as part of a multi-shot Chinese drama video generation pipeline. They go beyond what's in the original model card.
 
- ### Recommended LoRA stack
-
- For cinematic Chinese-drama output we found this stack works well, totalling 1.4 strength (under the 1.5 over-baking ceiling):
-
- | LoRA | Strength | Role |
- |---|---|---|
- | **LTX2.3-IC-LORA-Dual-Character** (this) | **0.6** | Multi-character identity from refs |
- | `LTX2.3_Fantasy_Realism` (vrgamedevgirl84) | 0.4 | Cinematic style |
- | `Motion_Stability` | 0.4 | Temporal coherence helper |
 
- If using standalone: 0.7–0.9 works well. If stacking with multiple other LoRAs: drop to 0.3–0.5.
 
  ### Resolution
 
@@ -98,9 +85,9 @@ If using standalone: 0.7–0.9 works well. If stacking with multiple other LoRAs
  ### Number of frames
 
  LTX-2.3 requires `num_frames` to satisfy `8k + 1` (e.g., 121, 145, 193, 241, 361). At 24 fps:
- - 5s shot = 121 frames
- - 8s shot = 193 frames
- - 15s shot = 361 frames
 
  ### Prompt structure that works well
 
@@ -129,7 +116,7 @@ If you wrap a single PNG character ref into a video for IC-LoRA conditioning, **
 
  #### 2. Repeat color tokens for dark-clothed characters
 
- This LoRA + Fantasy_Realism stack has a light-wuxia-robe bias. Dark outfits drift toward white at low ref-image-strength. Recipe: **repeat the color token glued to each clothing noun**:
 
  ```text
  BAD: black fedora and black suit
@@ -163,12 +150,11 @@ For multi-shot dialogue scenes, character identity drifts across cuts. Workaroun
  ### Render performance
 
  - **Resolution:** 1280×704, 121 frames @ 24 fps (~5 s output)
- - **Stack:** the production stack above (3 LoRAs, total 1.4)
- - **Hardware:** NVIDIA A800 80GB
  - **Time:** ~70 s per shot (8-step distilled + 3-step spatial upscaler + audio decode)
  - **Output:** mp4 with ambient audio track (no TTS)
 
- On consumer hardware (RTX 4090 24GB), expect ~3–4 minutes per shot due to memory pressure from the 22B model.
 
  ---
 
@@ -253,7 +239,7 @@ import torch
  # Use the IC-LoRA's standard SDOps mapping
  lora = LoraPathStrengthAndSDOps(
      "LTX2.3-IC-LORA-Dual-Character.safetensors",
-     0.6,  # strength
      _sd_ops_mod.LTXV_LORA_COMFY_RENAMING_MAP,
  )
 
@@ -269,7 +255,7 @@ video, audio = pipe(
      prompt="...",  # your structured 3-block prompt
      seed=42,
      height=704, width=1280,
-     num_frames=121,  # 5s @ 24fps, satisfies 8k+1
      frame_rate=24,
      video_conditioning=[("char_ref.mp4", 0.85)],  # 8-frame static wrap of the character portrait
      enhance_prompt=False,
@@ -281,8 +267,8 @@ video, audio = pipe(
 
  | GPU | VRAM | Works? |
  |---|---|---|
- | A100 / A800 80GB | 80 GB | ✅ ~70 s per 5s shot |
- | RTX 4090 / 3090 | 24 GB | ✅ ~3–4 min per 5s shot |
  | RTX 4080 / 4070 Ti Super | 16 GB | ❌ won't fit 22B in bf16 |
  | anything < 24 GB | - | ❌ no |
 
@@ -295,6 +281,15 @@ video, audio = pipe(
 
  ---
 
  ## License
 
  Apache License 2.0 - same as the original. See `LICENSE` and `NOTICE`.
 
 
  # LTX-Video 2.3 IC-LoRA: Dual-Character (English mirror)
 
+ An English-mirrored, field-tested **In-Context LoRA** for `Lightricks/LTX-2.3` (22B distilled), tuned for two-character dialogue scenes and multi-shot cinematic video generation.
 
  ---
 
  ## Example renders
 
+ Episode is an 8-shot Chinese palace drama (《玉佩定情》 + 《暗夜阴谋》) with three characters: 沈月华 (Shen Yuehua, heroine), 萧云霄 (Xiao Yunxiao, prince), 慕容静 (Murong Jing, antagonist). Render config: 1280×704, 121 frames @ 24 fps, ambient audio.
 
  ### Single-character identity - Shen Yuehua walking in the garden, picks up a jade pendant
  <video controls autoplay muted loop src="https://huggingface.co/SyFeee/LTX2.3-Dual-Character-en/resolve/main/examples/E1S1_garden_walk_single_character.mp4"></video>
 
  ### Three-character composition - the LoRA's upper limit
  <video controls autoplay muted loop src="https://huggingface.co/SyFeee/LTX2.3-Dual-Character-en/resolve/main/examples/E2S4_three_character_confrontation.mp4"></video>
 
  ---
 
  ## What this LoRA does
 
  | LoRA type | IC-LoRA (video-to-video conditioning) |
  | File | `LTX2.3-IC-LORA-Dual-Character.safetensors` (~313 MB) |
  | License | Apache 2.0 |
  | Trigger word | None - no special token required |
 
  ---
 
 
  The notes below are from running this LoRA in production as part of a multi-shot Chinese drama video generation pipeline. They go beyond what's in the original model card.
 
+ ### Strength
 
+ - **Standalone:** 0.7–0.9 works well
+ - **When stacking with other LoRAs:** drop to 0.3–0.5 to stay under the typical 1.5 over-baking ceiling
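
One way to sanity-check a stack against the 1.5 ceiling mentioned above is to sum the strengths before rendering. This is only a minimal sketch: the helper `within_strength_budget` and the LoRA names in the example dictionary are hypothetical, and the ceiling is the README's rule of thumb, not a limit enforced by any library.

```python
from typing import Mapping

# Rule-of-thumb ceiling quoted in the README; an assumption, not a hard limit.
OVERBAKE_CEILING = 1.5


def total_strength(stack: Mapping[str, float]) -> float:
    """Sum the strengths of every LoRA in the stack."""
    return sum(stack.values())


def within_strength_budget(
    stack: Mapping[str, float], ceiling: float = OVERBAKE_CEILING
) -> bool:
    """Return True when the combined strength stays strictly under the ceiling."""
    return total_strength(stack) < ceiling


# Hypothetical stack: this LoRA dropped into the 0.3-0.5 range next to two others.
example_stack = {
    "LTX2.3-IC-LORA-Dual-Character": 0.5,
    "some_style_lora": 0.4,   # placeholder name
    "some_motion_lora": 0.4,  # placeholder name
}
print(within_strength_budget(example_stack))
```

Raising this LoRA to a standalone-range value such as 0.9 in the same hypothetical stack would push the total past the ceiling, which is the situation the stacking advice is meant to avoid.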
 
  ### Resolution
 
 
  ### Number of frames
 
  LTX-2.3 requires `num_frames` to satisfy `8k + 1` (e.g., 121, 145, 193, 241, 361). At 24 fps:
+ - 5 s shot = 121 frames
+ - 8 s shot = 193 frames
+ - 15 s shot = 361 frames
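
As a rough illustration of the `8k + 1` rule quoted in this hunk, the sketch below validates a frame count and rounds a target duration up to the nearest valid count. The helper names `is_valid_num_frames` and `frames_for_duration` are hypothetical and not part of any LTX API; the only facts taken from the README are the `8k + 1` constraint and the 24 fps examples.

```python
import math


def is_valid_num_frames(num_frames: int) -> bool:
    """True when num_frames can be written as 8k + 1 for some integer k >= 0."""
    return num_frames >= 1 and (num_frames - 1) % 8 == 0


def frames_for_duration(seconds: float, fps: int = 24) -> int:
    """Smallest 8k + 1 frame count that covers at least seconds * fps frames."""
    raw_frames = seconds * fps
    k = max(0, math.ceil((raw_frames - 1) / 8))
    return 8 * k + 1


for seconds in (5, 8, 15):
    print(seconds, frames_for_duration(seconds))
```

At 24 fps this reproduces the three values listed above (121, 193, and 361 frames). Rounding up rather than down is a design choice of the sketch, so a requested duration is never shortened.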
 
  ### Prompt structure that works well
 
 
 
  #### 2. Repeat color tokens for dark-clothed characters
 
+ This LoRA has a light-wuxia-robe bias. Dark outfits drift toward white at low ref-image-strength. Recipe: **repeat the color token glued to each clothing noun**:
 
  ```text
  BAD: black fedora and black suit
 
  ### Render performance
 
  - **Resolution:** 1280×704, 121 frames @ 24 fps (~5 s output)
+ - **Hardware:** NVIDIA A800 80 GB
  - **Time:** ~70 s per shot (8-step distilled + 3-step spatial upscaler + audio decode)
  - **Output:** mp4 with ambient audio track (no TTS)
 
+ On consumer hardware (RTX 4090 24 GB), expect ~3–4 minutes per shot due to memory pressure from the 22B model.
 
  ---
 
 
  # Use the IC-LoRA's standard SDOps mapping
  lora = LoraPathStrengthAndSDOps(
      "LTX2.3-IC-LORA-Dual-Character.safetensors",
+     0.8,  # strength (standalone)
      _sd_ops_mod.LTXV_LORA_COMFY_RENAMING_MAP,
  )
 
 
      prompt="...",  # your structured 3-block prompt
      seed=42,
      height=704, width=1280,
+     num_frames=121,  # 5 s @ 24 fps, satisfies 8k+1
      frame_rate=24,
      video_conditioning=[("char_ref.mp4", 0.85)],  # 8-frame static wrap of the character portrait
      enhance_prompt=False,
 
 
  | GPU | VRAM | Works? |
  |---|---|---|
+ | A100 / A800 80 GB | 80 GB | ✅ ~70 s per 5 s shot |
+ | RTX 4090 / 3090 | 24 GB | ✅ ~3–4 min per 5 s shot |
  | RTX 4080 / 4070 Ti Super | 16 GB | ❌ won't fit 22B in bf16 |
  | anything < 24 GB | - | ❌ no |
 
 
 
  ---
 
+ ## Source attribution
+
+ > ⚠️ **This is an English-language mirror of [fxj1131's LTX2.3 IC-LoRA Dual-Character on ModelScope](https://www.modelscope.cn/models/fxj1131/LTX2.3-IC-LORA-Dual-Character).**
+ > All credit for the model weights belongs to the original author, **麻雀 AI (Maque AI)**.
+ > This mirror exists to make the model + documentation accessible to HuggingFace users who cannot easily access ModelScope, and to share field-tested usage notes from a production deployment.
+ > **The `.safetensors` weights file is unmodified and byte-identical to the ModelScope upload.**
+
+ ---
+
  ## License
 
  Apache License 2.0 - same as the original. See `LICENSE` and `NOTICE`.