valiantcat LoRA for LTX-2.3

This LoRA is trained on top of Lightricks/LTX-2.3 and is built with a custom training paradigm tailored for high-consistency video generation.

It was originally optimized for first-frame / last-frame guided transition videos, but the same training strategy also gives it strong generalization in:

First-to-last frame video generation
Text-to-video
Image-to-video
Stylized transformation and scene transition generation

Compared with a narrowly tuned transition LoRA, this version focuses more on motion continuity, semantic stability, prompt responsiveness, and cross-scene transformation quality, so it remains effective even outside the strict start-end frame setup.

Prompt: A medium close-up captures a young woman with wavy brown hair and a soft smile in a sun-drenched wooden room, her hands delicately clasped over a white embroidered sweater. As the camera slowly dollies forward, a seamless transformation occurs where her facial features gradually sharpen and her hair shortens into a dark, textured style, morphing her into a young man with a slight, knowing smirk. The delicate floral patterns on her garment shift and merge into a dark leaf motif on a beige knit sweater, maintaining the soft, warm lighting that highlights the grain of the wooden wall and the green leaves of the plant behind them. His hands reposition naturally during the transition, with one hand forming a loose fist near his chest as the movement settles into a calm, steady gaze toward the lens. The ambient sound of a gentle breeze rustling nearby indoor plants and the faint, rhythmic ticking of a wall clock fills the quiet space, emphasizing the fluid and magical nature of the character's evolution, zhuanchang

Prompt: A low-angle wide shot establishes a winding, wet asphalt road flanked by a dense, dark forest where heavy fog clings to the mossy tree trunks. The glistening surface of the road reflects the dim, moody light, highlighting the vibrant double yellow lines that curve into the misty distance. The camera glides forward smoothly at a low height, tracking the damp texture of the pavement as droplets of moisture fall from the overhanging emerald canopy. Suddenly, the camera tilts upward and accelerates, piercing through the thick, grey veil of the forest ceiling and ascending rapidly into a dense layer of rolling white clouds. As the camera breaks through the cloud deck, it reveals a breathtaking vista of a sharp, snow-dusted mountain peak piercing a brilliant, clear blue sky. The jagged rock textures and icy ridges of the summit are illuminated by crisp, high-altitude sunlight while soft clouds drift slowly around the mountain's base. The ambient sound begins with the rhythmic hum of tires on wet pavement and the soft dripping of water, transitioning into a deep, ethereal wind howl as the camera reaches the silent majesty of the peak, zhuanchang

Prompt: A medium shot captures Squidward, a teal, stylized 3D cephalopod with a furrowed brow and drooping eyelids, standing in a dimly lit blue room while clutching a dark clarinet. A warm glow from a music stand illuminates his disgruntled expression as he brings the instrument to his mouth, his fingers twitching over the keys in a hesitant, stiff rhythm. The camera suddenly dollies back rapidly and whips to the right, blurring the bamboo-textured walls and cool shadows into a streak of vibrant motion. This transition reveals a bright, warm-toned diner booth where SpongeBob, a porous yellow sponge with wide, sparkling eyes and a massive toothy grin, sits leaning forward. SpongeBob holds a thick, glistening hamburger with both hands, his shoulders bouncing up and down as he prepares to take a large, enthusiastic bite. The table is crowded with a tall pink milkshake topped with whipped cream and a carton of golden fries, all bathed in the soft, amber light of a vintage filament bulb hanging above. The audio begins with a single, discordant clarinet squeak that abruptly cuts into the ambient clinking of silverware and the low, energetic hum of a busy restaurant, zhuanchang

Prompt: In a medium shot within a brightly lit, modern office, a blonde woman with a high ponytail stands leaning against a desk, her form-fitting grey-blue blouse and light grey skirt catching the clean overhead light. A seamless transformation begins as fine orange fur ripples across her skin, spreading from her shoulders down her arms and up her neck. Her human features soften and stretch; her nose narrows into a dark, wet snout while her ears elongate and point upwards, piercing through her blonde hair as they become large, fur-lined fox ears. The camera pushes in slowly, focusing on the intricate texture of the fur and her widening amber eyes that blink with a curious intensity. Her hands shift into dark-furred paws and a thick, bushy tail with a white tip unfurls from behind her skirt, swaying gently. The ambient hum of the office is punctuated by the soft rustle of her shifting clothes and the quiet sound of her nose twitching as she fully settles into her anthropomorphic fox form, zhuanchang

Prompt: In a medium shot, a woman with long, wavy brown hair and a black tank top laughs under the dappled sunlight of an olive tree in a lush green garden. Her laughter gradually subsides into a serene, focused gaze as she slowly brings her palms together in a prayer position at the center of her chest. As her hands touch, the natural daylight rapidly dims into a deep twilight, and the realistic textures of the garden morph into a sharp, vibrant anime aesthetic. Glowing golden runes begin to etch themselves onto her forehead, neck, and arms, casting a warm yellow radiance against her skin. Her eyes shift from a natural hue to a piercing, luminous red as a circular halo of ancient symbols ignites behind her head. The camera zooms in slowly, revealing an ornate traditional temple courtyard with dark tiled roofs emerging from the shadows behind her. The sound of rustling leaves transitions into a resonant, low-frequency hum of magical energy accompanied by the distant, heavy chime of a temple bell, zhuanchang

Prompt: A medium close-up captures a young woman with long brown hair and bangs, wearing a delicate white semi-sheer blouse, as she sits against a warm wooden-paneled wall bathed in soft, diffused indoor light. Her neutral expression suddenly breaks as her eyes widen and her jaw drops, her hands flying up to clutch her cheeks in a physical display of intense surprise. The camera zooms in tight on her face while the realistic textures of her skin and clothing rapidly morph into vibrant, hand-drawn comic book aesthetics characterized by thick black outlines and flat cel-shaded colors. As the camera pulls back sharply, the single frame shatters into a four-panel comic grid, revealing the character now wearing a striped navy and tan polo shirt with her hair pulled into a high, neat bun. In the top panels, she gasps with her hands behind her head and then smiles radiantly with palms pressed to her face, while the bottom panels show her brow furrowing into a deep scowl and a pouty, frustrated grimace. She cries out, "Is this real life?" in a frantic, melodic tone, accompanied by the sharp sound of a page-turning and a whimsical pop. A heart-shaped frame emerges in the center of the grid, locking the character's realistic likeness into the stylized collage as the background pulses with energetic, radiating action lines, zhuanchang

Overview

LTX-2.3 is a strong base for controllable video generation, with improved visual quality and prompt adherence. On top of this foundation, this LoRA further enhances transformation-style motion, visual coherence, and scene-to-scene continuity.

The result is a practical LoRA that can handle:

precise transition tasks driven by start and end frames
open-ended prompt-driven generation
image-conditioned motion generation

Core Strengths

Excellent first-last frame transition quality: Produces smoother semantic and visual interpolation between two target states, reducing abrupt jumps and broken motion.
Works beyond transition-only scenarios: Even without explicit start/end frame constraints, it performs well in text-to-video and image-to-video generation.
Custom training paradigm: Trained with a dedicated methodology designed to improve controllability, temporal coherence, and subject consistency across changing scenes.
Strong prompt adaptability: Handles character change, style morphing, object transformation, scene switching, and cinematic motion prompts well.
Wide subject coverage: Effective on humans, animals, animation characters, environments, and mixed-concept prompts.

Best Use Cases

This LoRA is especially suitable for the following workflows:

First-to-last frame generation: Smoothly bridge two highly different frames while preserving motion logic and visual readability.
Text-to-video generation: Improve dynamic transformation prompts, scene evolution prompts, and narrative transition prompts.
Image-to-video generation: Add stronger motion intent and more expressive transformation capability to single-image driven video generation.
Creative transition design: Useful for transformation clips, cinematic cuts, identity morphs, object swaps, and surreal scene transitions.

Model File

| File | Recommended Strength (alpha) |
| --- | --- |
| `ltx2.3-transition.safetensors` | `1.0` |
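
For context on what a strength value scales, the general LoRA update can be written as below. This is the standard formulation, not a statement from this model card about how the file is applied. The symbols $s$ (loader-side strength), $r$ (rank), and $\alpha_{\text{train}}$ (training-time scaling) are notation introduced here, and reading the table's "alpha" as the multiplier $s$ is an assumption.

```latex
W' = W + s \cdot \frac{\alpha_{\text{train}}}{r} \, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}
```

Under this reading, $s = 1.0$ applies the adapter at the scale it was trained with, and smaller values blend the result back toward the base weights $W$.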

Recommended Settings

| Setting | Value |
| --- | --- |
| LoRA Strength | `1.0` |
| Embedded Guidance Scale | `1.0` |
| Classifier Free Guidance | `4.0` |

You can start from the settings above and then make small adjustments depending on:

how strong you want the transition effect to be
whether the prompt is more cinematic or more literal
whether the task is first-last frame, text-to-video, or image-to-video
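
One way to keep these starting values together and adjust them in small steps is a frozen dataclass. This is a minimal sketch: the class name `TransitionSettings`, its field names, and the `0.9` tweak are hypothetical and do not come from any LTX or ComfyUI API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TransitionSettings:
    """Starting values copied from the Recommended Settings table."""
    lora_strength: float = 1.0
    embedded_guidance_scale: float = 1.0
    cfg_scale: float = 4.0  # "Classifier Free Guidance" in the table


defaults = TransitionSettings()

# Hypothetical tweak for a slightly weaker transition effect. The 0.9 value
# is an illustration only, not a recommendation from the model card.
softer = replace(defaults, lora_strength=0.9)

print(defaults)
print(softer)
```

Because the dataclass is frozen, `replace` returns a new object and the recommended defaults stay available for comparison.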

Trigger Word

Recommended trigger phrase:

zhuanchang

If needed, place the trigger word near the end of the prompt so the base prompt still clearly describes:

subject
scene
camera movement
transformation behavior
atmosphere
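
The trigger placement described above can be sketched as a small string helper. The function name `with_trigger` is hypothetical, and the `, zhuanchang` separator simply mirrors the sample prompts on this page; it is an assumption that the exact separator does not matter to the model.

```python
TRIGGER = "zhuanchang"


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return the prompt with the trigger word as its final token."""
    text = prompt.strip()
    # Drop a trigger that is already at the end so it is never duplicated.
    if text.lower().endswith(trigger.lower()):
        text = text[: -len(trigger)]
    # Trim separators left dangling before the trigger.
    text = text.rstrip(" ,.;")
    return f"{text}, {trigger}" if text else trigger


print(with_trigger("A wide shot of a foggy road."))
print(with_trigger("A wide shot of a foggy road,zhuanchang"))
```

Both calls produce the same normalized ending, so the helper can be applied to prompts whether or not they already contain the trigger.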

Prompting Guide

For best results, prompts should usually contain:

Shot description: For example, close-up, medium shot, wide shot, low-angle, tracking shot.
Subject and environment: Describe the character, object, or scene as clearly as possible.
Motion or transformation process: Explain what changes over time: identity, style, object form, scene layout, or camera trajectory.
Visual details: Add texture, lighting, color, material, and spatial cues.
Ending trigger: Add zhuanchang when you want the LoRA behavior to activate more strongly.
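
Parts of this checklist can be approximated with a rough lexical check before a prompt is submitted. The sketch below is a heuristic, not a validator used by the model: the function name `prompt_gaps` is hypothetical, the shot terms are only the examples listed above, and treating a single-sentence prompt as underspecified is an assumption.

```python
import re

# Shot terms taken from the examples in the guide; extend as needed.
SHOT_TERMS = ("close-up", "medium shot", "wide shot", "low-angle", "tracking shot")
TRIGGER = "zhuanchang"


def prompt_gaps(prompt: str) -> list[str]:
    """Return notes about guide elements that appear to be missing."""
    text = prompt.lower()
    gaps = []
    if not any(term in text for term in SHOT_TERMS):
        gaps.append("no shot description found")
    # Sentence count is a crude proxy for describing a process over time.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if len(sentences) < 2:
        gaps.append("single sentence: motion or transformation may be underspecified")
    if not text.rstrip(" .").endswith(TRIGGER):
        gaps.append("trigger word is not at the end")
    return gaps


print(prompt_gaps("cat, forest, anime"))
```

Subject, environment, and visual detail cannot be judged by string matching, so an empty result only means the checkable items are present.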

Prompt Template

[shot type and camera language]. [subject and scene description]. [describe the motion, transformation, or transition process in detail]. [add lighting, texture, atmosphere, and composition cues]. zhuanchang
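
The template can also be filled programmatically. This is a minimal sketch with a hypothetical function name, `build_prompt`, and hypothetical parameter names for the four bracketed slots; the example values are condensed from the sample prompts on this page for illustration.

```python
def build_prompt(shot: str, subject: str, motion: str, details: str,
                 trigger: str = "zhuanchang") -> str:
    """Join the four template slots as sentences and end with the trigger."""
    parts = []
    for part in (shot, subject, motion, details):
        cleaned = part.strip().rstrip(".")
        if cleaned:  # skip slots that were left empty
            parts.append(cleaned + ".")
    return " ".join(parts + [trigger])


example = build_prompt(
    shot="A low-angle wide shot with the camera gliding forward",
    subject="A wet asphalt road winds through a dense, foggy forest",
    motion="The camera tilts upward and rises through the clouds to reveal a snow-dusted peak",
    details="Moody light on damp pavement gives way to crisp high-altitude sunlight",
)
print(example)
```

Keeping each slot as a separate argument makes it easy to vary only the motion description while holding the shot and scene fixed.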

Example Prompt

A low-angle wide shot establishes a winding, wet asphalt road flanked by a dense, dark forest where heavy fog clings to the mossy tree trunks. The glistening surface of the road reflects the dim, moody light, highlighting the vibrant double yellow lines that curve into the misty distance. The camera glides forward smoothly at a low height, tracking the damp texture of the pavement as droplets of moisture fall from the overhanging emerald canopy. Suddenly, the camera tilts upward and accelerates, piercing through the thick, grey veil of the forest ceiling and ascending rapidly into a dense layer of rolling white clouds. As the camera breaks through the cloud deck, it reveals a breathtaking vista of a sharp, snow-dusted mountain peak piercing a brilliant, clear blue sky. The jagged rock textures and icy ridges of the summit are illuminated by crisp, high-altitude sunlight while soft clouds drift slowly around the mountain's base. zhuanchang

Notes

This LoRA is trained on the LTX-2.3 foundation and is intended to complement the base model; it does not make up for a weak prompt.
Best results usually come from clear temporal instructions instead of short keyword-only prompts.
For transition-heavy tasks, stronger scene descriptions and explicit motion language generally improve stability.
For text-to-video and image-to-video tasks, keeping the prompt visually focused usually leads to better composition and cleaner motion.

ComfyUI Workflow

This LoRA works with a modified version of Kijai's LTX-2.3-Transition-LORA workflow. The main modification is adding an LTX-2.3-Transition-LORA node connected to the base model.

See the Downloads section above for the modified workflow.

Learn How to Use This Model

👉 Click here to watch the full video tutorial 👈

Download model

Weights for this model are available in Safetensors format.

Download

Training at Chongqing Valiant Cat

This model was trained by the AI Laboratory of Chongqing Valiant Cat Technology Co., LTD (https://vvicat.com/). Business cooperation is welcome.

Model tree for valiantcat/LTX-2.3-Transition-LORA

Base model

Lightricks/LTX-2.3
