Troubleshooting Guide

This guide covers common issues and solutions when training with the LTX-2 trainer.

🔧 VRAM and Memory Issues

Memory management is crucial for successful training with LTX-2.

Memory Optimization Techniques

1. Enable Gradient Checkpointing

Gradient checkpointing trades training speed for memory savings and is highly recommended for most training runs:

optimization:
  enable_gradient_checkpointing: true
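
If it helps to see what this flag controls: gradient checkpointing recomputes intermediate activations during the backward pass instead of caching them all. Below is a minimal, generic PyTorch sketch of the technique (an illustration only, not the trainer's implementation):

import torch
from torch.utils.checkpoint import checkpoint

# Illustration only: a checkpointed segment recomputes its activations during
# backward, trading extra compute for lower peak memory.
block = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.GELU(),
    torch.nn.Linear(1024, 1024),
)

x = torch.randn(8, 1024, requires_grad=True)
y = checkpoint(block, x, use_reentrant=False)  # activations rebuilt in backward
y.sum().backward()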

2. Enable 8-bit Text Encoder

Load the Gemma text encoder in 8-bit precision to save GPU memory:

acceleration:
  load_text_encoder_in_8bit: true

3. Reduce Batch Size

Lower the batch size if you encounter out-of-memory errors:

optimization:
  batch_size: 1  # Start with 1 and increase gradually

Use gradient accumulation to maintain a larger effective batch size:

optimization:
  batch_size: 1
  gradient_accumulation_steps: 4  # Effective batch size = 4
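
Under the hood, gradient accumulation sums gradients over several micro-batches and applies one optimizer step per group. A generic PyTorch sketch of the idea (illustration only, not the trainer's code):

import torch

# Illustration only: a toy model where 4 micro-batches form one effective batch.
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
data = [(torch.randn(1, 16), torch.randn(1, 1)) for _ in range(8)]

accumulation_steps = 4
optimizer.zero_grad()
for step, (x, target) in enumerate(data):
    loss = torch.nn.functional.mse_loss(model(x), target) / accumulation_steps
    loss.backward()                      # gradients accumulate in .grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                 # one update per 4 micro-batches
        optimizer.zero_grad()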

4. Use Lower Resolution

Reduce spatial or temporal dimensions to save memory:

# Smaller spatial resolution
uv run python scripts/process_dataset.py dataset.json \
    --resolution-buckets "512x512x49" \
    --model-path /path/to/model.safetensors \
    --text-encoder-path /path/to/gemma

# Fewer frames
uv run python scripts/process_dataset.py dataset.json \
    --resolution-buckets "960x544x25" \
    --model-path /path/to/model.safetensors \
    --text-encoder-path /path/to/gemma

5. Enable Model Quantization

Use quantization to reduce memory usage:

acceleration:
  quantization: "int8-quanto"  # Options: int8-quanto, int4-quanto, fp8-quanto
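
These options quantize model weights to lower precision at load time. Assuming they correspond to the optimum-quanto library, int8 weight quantization of an ordinary PyTorch module looks roughly like this (illustration only, not the trainer's loading code):

import torch
from optimum.quanto import quantize, freeze, qint8

# Illustration only: quantize a module's weights to int8 with optimum-quanto.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.Linear(1024, 1024))
quantize(model, weights=qint8)  # swap weights for int8 quantized versions
freeze(model)                   # materialize the quantized weights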

6. Use 8-bit Optimizer

The 8-bit AdamW optimizer uses less memory:

optimization:
  optimizer_type: "adamw8bit"
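
Most of the savings come from storing the optimizer state (the AdamW moment estimates) in 8-bit. For reference, the standalone equivalent with bitsandbytes, assuming that is the backing implementation, looks roughly like this (the trainer configures it for you):

import torch
import bitsandbytes as bnb

# Illustration only: 8-bit AdamW keeps optimizer state in 8-bit precision.
model = torch.nn.Linear(1024, 1024)
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=2e-4)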

⚠️ Common Usage Issues

Issue: "No module named 'ltx_trainer'" Error

Solution: Ensure you've installed the dependencies and are using uv run to execute scripts:

# From the repository root
uv sync
cd packages/ltx-trainer
uv run python scripts/train.py configs/ltx2_av_lora.yaml

Always use uv run to execute Python scripts. This automatically uses the correct virtual environment without requiring manual activation.

Issue: "Gemma model path is not a directory" Error

Solution: The text_encoder_path must point to a directory containing the Gemma model, not a file:

model:
  model_path: "/path/to/ltx-2-model.safetensors"  # File path
  text_encoder_path: "/path/to/gemma-model/"      # Directory path

Issue: "Model path does not exist" Error

Solution: LTX-2 requires local model paths. URLs are not supported:

# ✅ Correct - local path
model:
  model_path: "/path/to/ltx-2-model.safetensors"

# ❌ Wrong - URL not supported
model:
  model_path: "https://huggingface.co/..."

Issue: "Frames must satisfy frames % 8 == 1" Error

Solution: LTX-2 requires the number of frames to satisfy frames % 8 == 1:

  • ✅ Valid: 1, 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 121
  • ❌ Invalid: 24, 32, 48, 64, 100
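
If your clips have an arbitrary length, you can snap them down to the nearest valid count with a small helper like the one below (written for this guide, not part of the trainer):

def nearest_valid_frame_count(frames: int) -> int:
    """Round down to the largest count satisfying frames % 8 == 1."""
    if frames < 1:
        raise ValueError("need at least 1 frame")
    return ((frames - 1) // 8) * 8 + 1

print(nearest_valid_frame_count(50))   # 49
print(nearest_valid_frame_count(100))  # 97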

Issue: Slow Training Speed

Optimizations:

  1. Disable gradient checkpointing (if you have enough VRAM):

    optimization:
      enable_gradient_checkpointing: false
    
  2. Use torch.compile via Accelerate:

    uv run accelerate launch --config_file configs/accelerate/ddp_compile.yaml \
      scripts/train.py configs/ltx2_av_lora.yaml
    

Issue: Poor Quality Validation Outputs

Solutions:

  1. Use Image-to-Video Validation: For more reliable validation, use image-to-video (first-frame conditioning) rather than pure text-to-video:

    validation:
      prompts:
        - "a professional portrait video of a person"
      images:
        - "/path/to/first_frame.png"  # One image per prompt
    
  2. Increase inference steps:

    validation:
      inference_steps: 50  # Default is 30
    
  3. Adjust guidance settings:

    validation:
      guidance_scale: 3.0  # CFG scale (recommended: 3.0)
      stg_scale: 1.0       # STG scale for temporal coherence (recommended: 1.0)
      stg_blocks: [29]     # Transformer block to perturb
    
  4. Check caption quality: If you are using auto-generated captions, review them and edit for accuracy. LTX-2 prefers long, detailed captions that describe both the visual content and the audio (e.g., ambient sounds, speech, music).

  5. Check target modules: Ensure your target_modules configuration matches your training goals. For audio-video training, use patterns that match both branches (e.g., "to_k" instead of "attn1.to_k"). See Understanding Target Modules for details.

  6. Adjust LoRA rank: Try higher values for more capacity:

    lora:
      rank: 64  # Or 128 for more capacity
    
  7. Increase training steps:

    optimization:
      steps: 3000
    

πŸ” Debugging Tools

Monitor GPU Memory Usage

Track memory usage during training:

# Watch GPU memory in real-time
watch -n 1 nvidia-smi

# Log memory usage to file
nvidia-smi --query-gpu=memory.used,memory.total --format=csv --loop=5 > memory_log.csv
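
You can also check memory from inside Python using PyTorch's CUDA APIs, for example to print current and peak usage (a generic snippet, independent of the trainer):

import torch

# Generic snippet: report current and peak GPU memory on device 0.
if torch.cuda.is_available():
    device = torch.device("cuda:0")
    total = torch.cuda.get_device_properties(device).total_memory
    allocated = torch.cuda.memory_allocated(device)
    peak = torch.cuda.max_memory_allocated(device)
    print(f"allocated {allocated / 1e9:.2f} GB / {total / 1e9:.2f} GB (peak {peak / 1e9:.2f} GB)")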

Verify Preprocessed Data

Decode latents to visualize the preprocessed videos:

uv run python scripts/decode_latents.py dataset/.precomputed/latents debug_output \
    --model-path /path/to/model.safetensors

To also decode audio latents, add the --with-audio flag:

uv run python scripts/decode_latents.py dataset/.precomputed/latents debug_output \
    --model-path /path/to/model.safetensors \
    --with-audio

Compare decoded videos and audio with originals to ensure quality.


💡 Best Practices

Before Training

  • Test preprocessing with a small subset first
  • Verify all video files are accessible
  • Check available GPU memory
  • Review configuration against hardware capabilities
  • Ensure model and text encoder paths are correct

During Training

  • Monitor GPU memory usage
  • Check loss convergence regularly
  • Review validation samples periodically
  • Save checkpoints frequently

After Training

  • Test trained model with diverse prompts
  • Document training parameters and results
  • Archive training data and configs

🆘 Getting Help

If you're still experiencing issues:

  1. Check logs: Review console output for error details
  2. Search issues: Look through GitHub issues for similar problems
  3. Provide details: When reporting issues, include:
    • Hardware specifications (GPU model, VRAM)
    • Configuration file used
    • Complete error message
    • Steps to reproduce the issue

🤝 Join the Community

Have questions, want to share your results, or need real-time help? Join our community Discord server to connect with other users and the development team!

  • Get troubleshooting help
  • Share your training results and workflows
  • Stay up to date with announcements and updates

We look forward to seeing you there!