---
license: mit
tags:
- lora
- training
- runpod
- ai-toolkit
---
# AI Trainer - RunPod Serverless

Single-endpoint, multi-model LoRA training service using ai-toolkit. GPU memory is automatically cleaned up when switching between models.
## Supported Models
| Model Key | Description | Base Model |
|---|---|---|
| wan21_1b | Wan2.1 1.3B Video | Wan-AI/Wan2.1-T2V-1.3B-Diffusers |
| wan21_14b | Wan2.1 14B Video | Wan-AI/Wan2.1-T2V-14B-Diffusers |
| wan22_14b | Wan2.2 14B Video | ai-toolkit/Wan2.2-T2V-A14B-Diffusers-bf16 |
| qwen_image | Qwen Image Gen | Qwen/Qwen-Image |
| qwen_image_edit | Qwen Image Edit | Qwen/Qwen-Image-Edit |
| flux_dev | FLUX.1 Dev | black-forest-labs/FLUX.1-dev |
| flux_schnell | FLUX.1 Schnell | black-forest-labs/FLUX.1-schnell |
## API Usage

### List Models

```json
{"input": {"action": "list_models"}}
```

### Check Status

```json
{"input": {"action": "status"}}
```

### Manual Cleanup

```json
{"input": {"action": "cleanup"}}
```
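The payloads above all share the `{"input": {"action": ...}}` envelope, so a small client helper can build and send them. This is a sketch, not code from this repo: `RUNPOD_ENDPOINT_ID` and `RUNPOD_API_KEY` are placeholders you must supply, and the `/runsync` URL is RunPod's standard synchronous endpoint.

```python
import json
import urllib.request

RUNPOD_ENDPOINT_ID = "your-endpoint-id"  # placeholder, not from this repo
RUNPOD_API_KEY = "your-api-key"          # placeholder, not from this repo

def build_request(action: str, **fields) -> dict:
    """Build the JSON body the handler expects: {"input": {"action": ...}}."""
    return {"input": {"action": action, **fields}}

def call_endpoint(body: dict) -> bytes:
    """POST the body to RunPod's synchronous /runsync endpoint."""
    req = urllib.request.Request(
        f"https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/runsync",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {RUNPOD_API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

For example, `build_request("list_models")` reproduces the first payload shown above.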
### Train LoRA

```json
{
  "input": {
    "action": "train",
    "model": "flux_dev",
    "params": {
      "dataset_path": "/workspace/dataset",
      "output_path": "/workspace/output",
      "steps": 1000,
      "batch_size": 1,
      "learning_rate": 1e-4,
      "lora_rank": 16
    }
  }
}
```
## Training Parameters
| Parameter | Description | Default |
|---|---|---|
| dataset_path | Path to training images | /workspace/dataset |
| output_path | Output directory | /workspace/output |
| steps | Training steps | 2000 |
| batch_size | Batch size | 1 |
| learning_rate | Learning rate | 1e-4 |
| lora_rank | LoRA rank | 16-32 |
| save_every | Save checkpoint interval | 250 |
| sample_every | Sample generation interval | 250 |
| trigger_word | Trigger word for training | None |
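A hypothetical client-side wrapper can merge the table's defaults with per-run overrides, so a train request only needs to name the model. This is an illustration, not the handler's own code; for `lora_rank` it uses 16, matching the example above (the table lists a 16-32 range).

```python
# Defaults taken from the Training Parameters table above.
DEFAULT_PARAMS = {
    "dataset_path": "/workspace/dataset",
    "output_path": "/workspace/output",
    "steps": 2000,
    "batch_size": 1,
    "learning_rate": 1e-4,
    "lora_rank": 16,       # example value; table lists 16-32
    "save_every": 250,
    "sample_every": 250,
    "trigger_word": None,
}

def train_request(model: str, **overrides) -> dict:
    """Build a complete train payload, overriding defaults as needed."""
    params = {**DEFAULT_PARAMS, **overrides}
    return {"input": {"action": "train", "model": model, "params": params}}
```

For instance, `train_request("flux_dev", steps=1000)` yields the Train LoRA payload shown earlier, with the remaining parameters at their defaults.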
## RunPod Deployment

### Environment Variables

- `HF_TOKEN`: HuggingFace token for gated models (required for FLUX, Qwen)
### Model Caching

Models are cached at `/runpod-volume/huggingface-cache/hub/` for faster subsequent loads.

For optimal cold starts, set the RunPod Model field to one of:

- `black-forest-labs/FLUX.1-dev` (for FLUX training)
- `ai-toolkit/Wan2.2-T2V-A14B-Diffusers-bf16` (for Wan 2.2)
- `Qwen/Qwen-Image` (for Qwen Image)
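The cache location above is consistent with pointing `HF_HOME` (the standard `huggingface_hub` environment variable) at the network volume; the library then stores weights under its `hub/` subdirectory. This wiring is an assumption about the deployment, not shown in this repo:

```python
import os

# Assumed setup: huggingface_hub caches model weights under
# $HF_HOME/hub/, so this puts them on the persistent network volume.
os.environ["HF_HOME"] = "/runpod-volume/huggingface-cache"
```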
## Auto-Cleanup

The handler automatically cleans up GPU memory when switching between models:

- Full cleanup when changing model types
- Light cleanup when reusing the same model
- Manual cleanup via the `cleanup` action
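A full cleanup of this kind typically combines Python garbage collection with PyTorch's CUDA allocator calls. The sketch below shows the general pattern under that assumption; it is not the repo's exact implementation:

```python
import gc

def cleanup_gpu(full: bool = True) -> None:
    """Free GPU memory: drop Python references, then release CUDA caches."""
    gc.collect()  # drop unreferenced Python-side tensors first
    try:
        import torch
    except ImportError:
        return  # no torch in this environment; nothing GPU-side to free
    if torch.cuda.is_available():
        torch.cuda.empty_cache()      # return cached allocator blocks to the driver
        if full:
            torch.cuda.ipc_collect()  # also reclaim CUDA IPC shared memory
```

A "light" cleanup would correspond to `cleanup_gpu(full=False)`, leaving the loaded model in place while releasing unused cache.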