Troubleshooting Guide
This guide covers common issues and solutions when training with the LTX-2 trainer.
🔧 VRAM and Memory Issues
Memory management is crucial for successful training with LTX-2.
Memory Optimization Techniques
1. Enable Gradient Checkpointing
Gradient checkpointing trades training speed for memory savings. Highly recommended for most training runs:
```yaml
optimization:
  enable_gradient_checkpointing: true
```
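To see what this option trades off, here is a minimal, generic PyTorch sketch (illustration only, not the trainer's internals): activations inside the checkpointed module are not stored during the forward pass and are recomputed during backward, lowering peak memory at the cost of extra compute.

```python
import torch
from torch.utils.checkpoint import checkpoint

# A generic module wrapped with activation checkpointing. Intermediate
# activations inside `block` are recomputed in backward instead of stored.
block = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.GELU(),
    torch.nn.Linear(512, 512),
)

x = torch.randn(4, 512, requires_grad=True)
y = checkpoint(block, x, use_reentrant=False)  # recompute in backward
y.sum().backward()
```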
2. Enable 8-bit Text Encoder
Load the Gemma text encoder in 8-bit precision to save GPU memory:
```yaml
acceleration:
  load_text_encoder_in_8bit: true
```
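For intuition, this is how 8-bit loading is commonly done with transformers plus bitsandbytes; the trainer's flag presumably wraps something similar, though that is an assumption. The model path below is a placeholder.

```python
from transformers import AutoModel, BitsAndBytesConfig

# Sketch only: load a text encoder with int8 weights via bitsandbytes.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
text_encoder = AutoModel.from_pretrained(
    "/path/to/gemma-model/",           # local Gemma directory (placeholder)
    quantization_config=quant_config,  # weights stored in int8
    device_map="auto",
)
```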
3. Reduce Batch Size
Lower the batch size if you encounter out-of-memory errors:
```yaml
optimization:
  batch_size: 1  # Start with 1 and increase gradually
```
Use gradient accumulation to maintain a larger effective batch size:
```yaml
optimization:
  batch_size: 1
  gradient_accumulation_steps: 4  # Effective batch size = 4
```
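The arithmetic behind "effective batch size" is simple: gradients are accumulated over several micro-batches (and, if you train data-parallel, across GPUs) before each optimizer step. A sketch, assuming single-GPU training:

```python
# Effective batch size = per-GPU batch * accumulation steps * world size.
batch_size = 1                   # per-GPU micro-batch
gradient_accumulation_steps = 4
num_gpus = 1                     # data-parallel world size, if any

effective_batch = batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 4
```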
4. Use Lower Resolution
Reduce spatial or temporal dimensions to save memory:
```bash
# Smaller spatial resolution
uv run python scripts/process_dataset.py dataset.json \
  --resolution-buckets "512x512x49" \
  --model-path /path/to/model.safetensors \
  --text-encoder-path /path/to/gemma

# Fewer frames
uv run python scripts/process_dataset.py dataset.json \
  --resolution-buckets "960x544x25" \
  --model-path /path/to/model.safetensors \
  --text-encoder-path /path/to/gemma
```
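A small helper (illustration only, not part of the trainer) makes the memory impact of a bucket concrete: it parses a `WIDTHxHEIGHTxFRAMES` bucket string, enforces the frames rule described later in this guide, and compares raw pixel volume, which activation memory roughly tracks.

```python
# Parse a "WIDTHxHEIGHTxFRAMES" bucket string and compare pixel volumes.
def parse_bucket(bucket: str) -> tuple[int, int, int]:
    w, h, f = (int(p) for p in bucket.split("x"))
    assert f % 8 == 1, f"frames must satisfy frames % 8 == 1, got {f}"
    return w, h, f

def volume(bucket: str) -> int:
    w, h, f = parse_bucket(bucket)
    return w * h * f

print(volume("960x544x49") / volume("512x512x49"))  # ~2x the pixel volume
```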
5. Enable Model Quantization
Use quantization to reduce memory usage:
```yaml
acceleration:
  quantization: "int8-quanto"  # Options: int8-quanto, int4-quanto, fp8-quanto
```
6. Use 8-bit Optimizer
The 8-bit AdamW optimizer uses less memory:
```yaml
optimization:
  optimizer_type: "adamw8bit"
```
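A minimal sketch, assuming the `adamw8bit` option corresponds to bitsandbytes' `AdamW8bit` (an assumption, not confirmed by this guide): the 8-bit variant stores the optimizer's moment state in int8, substantially reducing the per-parameter state memory of standard AdamW.

```python
import torch
import bitsandbytes as bnb

# Optimizer state (first/second moments) is kept in 8-bit precision.
model = torch.nn.Linear(1024, 1024)
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4)
```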
⚠️ Common Usage Issues
Issue: "No module named 'ltx_trainer'" Error
Solution:
Ensure you've installed the dependencies and are using uv run to execute scripts:
```bash
# From the repository root
uv sync
cd packages/ltx-trainer
uv run python scripts/train.py configs/ltx2_av_lora.yaml
```
Always use `uv run` to execute Python scripts. This automatically uses the correct virtual environment without requiring manual activation.
Issue: "Gemma model path is not a directory" Error
Solution:
The text_encoder_path must point to a directory containing the Gemma model, not a file:
```yaml
model:
  model_path: "/path/to/ltx-2-model.safetensors"  # File path
  text_encoder_path: "/path/to/gemma-model/"      # Directory path
```
Issue: "Model path does not exist" Error
Solution: LTX-2 requires local model paths. URLs are not supported:
```yaml
# ✅ Correct - local path
model:
  model_path: "/path/to/ltx-2-model.safetensors"

# ❌ Wrong - URL not supported
model:
  model_path: "https://huggingface.co/..."
```
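A hedged pre-flight check (illustration only, not part of the trainer) that enforces both path rules from the two issues above before launching a long run:

```python
from pathlib import Path

def validate_paths(model_path: str, text_encoder_path: str) -> None:
    # model_path must be an existing local file, not a URL.
    if model_path.startswith(("http://", "https://")):
        raise ValueError("URLs are not supported; download the model locally")
    if not Path(model_path).is_file():
        raise FileNotFoundError(f"Model path does not exist: {model_path}")
    # text_encoder_path must be a directory containing the Gemma model.
    if not Path(text_encoder_path).is_dir():
        raise NotADirectoryError(
            f"Gemma model path is not a directory: {text_encoder_path}"
        )

# validate_paths("/path/to/ltx-2-model.safetensors", "/path/to/gemma-model/")
```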
Issue: "Frames must satisfy frames % 8 == 1" Error
Solution:
LTX-2 requires the number of frames to satisfy frames % 8 == 1:
- ✅ Valid: 1, 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 121
- ❌ Invalid: 24, 32, 48, 64, 100
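Two quick helpers (illustration only) for checking a frame count against this constraint and snapping an invalid count down to the nearest valid value:

```python
def is_valid_frame_count(frames: int) -> bool:
    # LTX-2 requires frames % 8 == 1.
    return frames % 8 == 1

def snap_down(frames: int) -> int:
    """Largest valid frame count <= frames."""
    return (frames - 1) // 8 * 8 + 1

print(is_valid_frame_count(49))  # True
print(snap_down(50))             # 49
print(snap_down(100))            # 97
```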
Issue: Slow Training Speed
Optimizations:
1. Disable gradient checkpointing (if you have enough VRAM):
```yaml
optimization:
  enable_gradient_checkpointing: false
```
2. Use torch.compile via Accelerate (a generic sketch of what torch.compile does follows below):
```bash
uv run accelerate launch --config_file configs/accelerate/ddp_compile.yaml \
  scripts/train.py configs/ltx2_av_lora.yaml
```
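For intuition, this is what torch.compile does in plain PyTorch (the Accelerate config above enables it for the whole run; this standalone sketch is only an illustration):

```python
import torch

# The module is JIT-compiled into fused kernels on first call; subsequent
# calls with the same shapes reuse the compiled graph and run faster.
model = torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.ReLU())
compiled_model = torch.compile(model)

x = torch.randn(8, 256)
y = compiled_model(x)  # first call triggers compilation
```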
Issue: Poor Quality Validation Outputs
Solutions:
1. Use image-to-video validation: For more reliable validation, use image-to-video (first-frame conditioning) rather than pure text-to-video:
```yaml
validation:
  prompts:
    - "a professional portrait video of a person"
  images:
    - "/path/to/first_frame.png"  # One image per prompt
```
2. Increase inference steps:
```yaml
validation:
  inference_steps: 50  # Default is 30
```
3. Adjust guidance settings:
```yaml
validation:
  guidance_scale: 3.0  # CFG scale (recommended: 3.0)
  stg_scale: 1.0       # STG scale for temporal coherence (recommended: 1.0)
  stg_blocks: [29]     # Transformer block to perturb
```
4. Check caption quality: Review and manually edit captions for accuracy if using auto-generated captions. LTX-2 prefers long, detailed captions that describe both visual content and audio (e.g., ambient sounds, speech, music).
5. Check target modules: Ensure your `target_modules` configuration matches your training goals. For audio-video training, use patterns that match both branches (e.g., `"to_k"` instead of `"attn1.to_k"`); see the sketch after this list. See Understanding Target Modules for details.
6. Adjust LoRA rank: Try higher values for more capacity:
```yaml
lora:
  rank: 64  # Or 128 for more capacity
```
7. Increase training steps:
```yaml
optimization:
  steps: 3000
```
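To illustrate the target-modules point above, here is a sketch of why a broad pattern can cover both branches while a narrower one may not. The module names below are invented for this example; inspect your model's `named_modules()` to see the real ones.

```python
# Hypothetical module names -- placeholders, not the trainer's actual names.
module_names = [
    "video_blocks.0.attn1.to_k",  # hypothetical video-branch projection
    "audio_blocks.0.attn.to_k",   # hypothetical audio-branch projection
]

def matched(pattern: str) -> list[str]:
    # Simple substring matching, for illustration only.
    return [name for name in module_names if pattern in name]

print(matched("to_k"))        # matches both branches
print(matched("attn1.to_k"))  # matches the video branch only
```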
🔍 Debugging Tools
Monitor GPU Memory Usage
Track memory usage during training:
```bash
# Watch GPU memory in real-time
watch -n 1 nvidia-smi

# Log memory usage to file
nvidia-smi --query-gpu=memory.used,memory.total --format=csv --loop=5 > memory_log.csv
```
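nvidia-smi reports usage from outside the process; you can also query PyTorch's CUDA caching allocator from inside a training script. A minimal sketch (requires a CUDA GPU; illustration only):

```python
import torch

if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()
    # ... run a training step here ...
    print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.2f} GiB")
    print(f"peak:      {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```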
Verify Preprocessed Data
Decode latents to visualize the preprocessed videos:
```bash
uv run python scripts/decode_latents.py dataset/.precomputed/latents debug_output \
  --model-path /path/to/model.safetensors
```
To also decode audio latents, add the --with-audio flag:
```bash
uv run python scripts/decode_latents.py dataset/.precomputed/latents debug_output \
  --model-path /path/to/model.safetensors \
  --with-audio
```
Compare decoded videos and audio with originals to ensure quality.
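One simple quantitative check is per-frame PSNR between an original frame and its decoded counterpart. A sketch, assuming you have already extracted matching frames as same-shape uint8 arrays (how you load them, e.g. via imageio or OpenCV, is up to you):

```python
import numpy as np

def psnr(original: np.ndarray, decoded: np.ndarray) -> float:
    # Peak signal-to-noise ratio for 8-bit images; higher is better.
    mse = np.mean((original.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10 * np.log10(255.0 ** 2 / mse)
```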
💡 Best Practices
Before Training
- Test preprocessing with a small subset first
- Verify all video files are accessible
- Check available GPU memory
- Review configuration against hardware capabilities
- Ensure model and text encoder paths are correct
During Training
- Monitor GPU memory usage
- Check loss convergence regularly
- Review validation samples periodically
- Save checkpoints frequently
After Training
- Test trained model with diverse prompts
- Document training parameters and results
- Archive training data and configs
🆘 Getting Help
If you're still experiencing issues:
- Check logs: Review console output for error details
- Search issues: Look through GitHub issues for similar problems
- Provide details: When reporting issues, include:
- Hardware specifications (GPU model, VRAM)
- Configuration file used
- Complete error message
- Steps to reproduce the issue
🤗 Join the Community
Have questions, want to share your results, or need real-time help? Join our community Discord server to connect with other users and the development team!
- Get troubleshooting help
- Share your training results and workflows
- Stay up to date with announcements and updates
We look forward to seeing you there!