# Local AI Video Generator
Generate AI videos **completely locally** on your computer using the CogVideoX-2B model!
## Features
- **100% Local** - No API keys, no cloud services, runs on your computer
- **CogVideoX-2B** - Open-source text-to-video model from Tsinghua University (THUDM)
- **6-second videos** - Generate 49 frames at 8 fps (720×480 resolution)
- **GPU or CPU** - Works on both (GPU recommended for speed)
- **Simple UI** - Clean web interface for easy video generation
## Requirements
### Hardware Requirements
**Minimum (CPU):**
- 16GB RAM
- 10GB free disk space
- Generation time: 5-10 minutes per video
**Recommended (GPU):**
- NVIDIA GPU with 8GB+ VRAM (RTX 3060 or better)
- 16GB RAM
- 10GB free disk space
- Generation time: 30-120 seconds per video
### Software Requirements
- Python 3.9 or higher
- CUDA 11.8+ (for GPU acceleration)
## Quick Start
### 1. Install Dependencies
```bash
# Install PyTorch with CUDA support (for GPU)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Or install PyTorch for CPU only
pip install torch torchvision torchaudio
# Install other requirements
pip install -r requirements_local.txt
```
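After installing, it can help to confirm which device PyTorch will actually use. The sketch below is not part of this project: `pick_device` is a hypothetical helper name, and it simply falls back to `"cpu"` when PyTorch is missing or no CUDA device is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"PyTorch will run on: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the CPU-only PyTorch build was probably installed; reinstalling with the CUDA index URL above is the usual fix.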
### 2. Run the Backend
```bash
python backend_local.py
```
The server will start on `http://localhost:5000`
**First Run Notes:**
- The model (~5GB) will be downloaded automatically
- This happens only once
- Subsequent runs will be much faster
### 3. Open the Web Interface
Open `index_local.html` in your browser:
```bash
# On macOS
open index_local.html
# On Linux
xdg-open index_local.html
# On Windows
start index_local.html
```
Alternatively, open `http://localhost:5000` in your browser and navigate to the HTML file.
### 4. Initialize the Model
1. Click the **"Initialize Model"** button in the UI
2. Wait 2-5 minutes for the model to load
3. Once loaded, you can start generating videos!
### 5. Generate Videos
1. Enter a descriptive prompt (e.g., "A cat playing with a ball of yarn")
2. Click **"Generate Video"**
3. Wait 30-120 seconds (GPU) or 5-10 minutes (CPU)
4. Download or share your video!
## Example Prompts
- "A golden retriever running through a field of flowers at sunset"
- "Ocean waves crashing on a beach, aerial view"
- "A bird flying through clouds, slow motion"
- "City street with cars at night, neon lights"
- "Flowers blooming in a garden, time-lapse"
## Tips for Best Results
1. **Be Descriptive** - Include details about lighting, camera angle, movement
2. **Keep it Simple** - Focus on one main subject or action
3. **Use Cinematic Terms** - "aerial view", "close-up", "slow motion", etc.
4. **GPU Recommended** - Much faster generation (30-120s vs 5-10min)
5. **First Generation** - May take longer while the model initializes
## Troubleshooting
### Model Not Loading
- **Issue**: Model fails to download or load
- **Solution**: Check internet connection, ensure 10GB free disk space
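A quick way to rule out the disk-space cause is to measure free space before the first run. This is a minimal sketch using only the standard library; `free_gb` and `has_room_for_model` are hypothetical helper names, and the 10 GB threshold is taken from the requirements listed earlier.

```python
import shutil


def free_gb(path: str = ".") -> float:
    """Free space, in gigabytes (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_room_for_model(path: str = ".", required_gb: float = 10.0) -> bool:
    """True when at least `required_gb` gigabytes are free at `path`."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"Free space: {free_gb():.1f} GB, enough: {has_room_for_model()}")
```

Pass the directory where the model is actually cached. Hugging Face libraries usually cache downloads under your home directory rather than the project folder, so that is the drive worth checking.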
### Out of Memory (GPU)
- **Issue**: CUDA out of memory error
- **Solution**: Close other GPU applications, or use CPU mode
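Before falling back to CPU mode, the memory-saving switches built into Diffusers may be enough. The sketch below assumes the backend holds a Diffusers `CogVideoXPipeline` object (called `pipe` here) and that the `accelerate` package is installed; `apply_memory_savers` is a hypothetical helper, not a function in `backend_local.py`.

```python
def apply_memory_savers(pipe):
    """Enable Diffusers' memory-saving options on an already-loaded pipeline.

    Assumes `pipe` is a diffusers CogVideoXPipeline that has NOT been moved
    to the GPU with .to("cuda"); CPU offload manages device placement itself.
    """
    # Keep submodels on the CPU and move each to the GPU only while it runs.
    pipe.enable_model_cpu_offload()
    # Decode the video latents in tiles and slices instead of all at once.
    pipe.vae.enable_tiling()
    pipe.vae.enable_slicing()
    return pipe
```

These options trade generation speed for lower peak VRAM, so expect slower runs than the timings quoted in this README.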
### Slow Generation (CPU)
- **Issue**: Takes 5-10 minutes per video
- **Solution**: This is normal for CPU. Consider using a GPU for faster generation
### Server Won't Start
- **Issue**: Port 5000 already in use
- **Solution**: Change port in `backend_local.py` (line 33): `FLASK_PORT = 5001`
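To find out whether port 5000 is really taken, and which nearby port is free, a small standard-library probe is enough. This is an illustrative sketch; `port_in_use` and `first_free_port` are hypothetical names and are not part of `backend_local.py`.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


def first_free_port(start: int = 5000, attempts: int = 20) -> int:
    """Return the first port in [start, start + attempts) with no listener."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    raise RuntimeError(f"no free port found in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    print(f"Suggested FLASK_PORT: {first_free_port()}")
```

If you change the port, remember that the web interface must point at the same port as the backend.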
### Video Quality Issues
- **Issue**: Video looks blurry or low quality
- **Solution**: This is expected for the 2B model. For better quality, upgrade to CogVideoX-5B (requires more VRAM)
## Performance Benchmarks
| Hardware | Model Load Time | Generation Time | Quality |
|----------|----------------|-----------------|---------|
| RTX 4090 | 1-2 min | 30-45 sec | Excellent |
| RTX 3060 | 2-3 min | 60-90 sec | Good |
| CPU (16GB) | 3-5 min | 5-10 min | Good |
## Model Information
- **Model**: CogVideoX-2B
- **Developer**: Tsinghua University (THUDM)
- **License**: Apache 2.0
- **Size**: ~5GB
- **Output**: 49 frames, 720×480, 8 fps (~6 seconds)
## File Structure
```
hailuo-clone/
├── backend_local.py        # Local backend server
├── index_local.html        # Web interface for local backend
├── requirements_local.txt  # Python dependencies
├── README_LOCAL.md         # This file
└── generated_videos/       # Output directory (auto-created)
```
## Comparison with Cloud Backends
| Feature | Local (backend_local.py) | Cloud (backend_enhanced.py) |
|---------|-------------------------|----------------------------|
| Setup | Complex (install PyTorch, download model) | Simple (just API keys) |
| Cost | Free (one-time setup) | Pay per generation |
| Speed | 30-120s (GPU) or 5-10min (CPU) | 30-60s |
| Privacy | 100% private | Data sent to cloud |
| Quality | Good (2B model) | Excellent (5B+ models) |
| Internet | Only for first download | Required for every generation |
## Advanced Configuration
### Change Model
Edit lines 54-56 of `backend_local.py` to use a different model:
```python
# For better quality (requires 16GB+ VRAM)
pipeline = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b",
    torch_dtype=torch.float16,
)
```
### Adjust Generation Parameters
Edit `backend_local.py` lines 126-132:
```python
num_frames = 49 # More frames = longer video
guidance_scale = 6.0 # Higher = more prompt adherence
num_inference_steps = 50 # More steps = better quality (slower)
```
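The sketch below shows how these three settings could be passed to a Diffusers `CogVideoXPipeline`, and how the frame count translates into clip length. It illustrates the public Diffusers API under stated assumptions and is not a copy of `backend_local.py`; `clip_seconds` and `generate` are hypothetical helper names, and `pipe` is assumed to be an already-loaded pipeline.

```python
def clip_seconds(num_frames: int, fps: int = 8) -> float:
    """Length of the exported clip when frames are played back at `fps`."""
    return num_frames / fps


def generate(pipe, prompt: str, out_path: str = "output.mp4",
             num_frames: int = 49, guidance_scale: float = 6.0,
             num_inference_steps: int = 50, fps: int = 8) -> str:
    """Run a loaded diffusers CogVideoXPipeline and write the clip to disk."""
    # Imported lazily because diffusers is a heavy dependency.
    from diffusers.utils import export_to_video

    frames = pipe(
        prompt=prompt,
        num_frames=num_frames,
        guidance_scale=guidance_scale,
        num_inference_steps=num_inference_steps,
    ).frames[0]
    export_to_video(frames, out_path, fps=fps)
    return out_path


print(clip_seconds(49))  # 6.125
```

With the defaults, 49 frames at 8 fps is 6.125 seconds, which is where the "~6 seconds" figure in this README comes from. Generation time grows with both `num_frames` and `num_inference_steps`.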
### Pre-load Model on Startup
Uncomment lines 232-233 in `backend_local.py`:
```python
logger.info("Pre-loading model...")
initialize_model()
```
## Resources
- [CogVideoX GitHub](https://github.com/THUDM/CogVideo)
- [Diffusers Documentation](https://huggingface.co/docs/diffusers)
- [PyTorch Installation](https://pytorch.org/get-started/locally/)
## Support
If you encounter issues:
1. Check the console logs in the terminal
2. Check browser console (F12) for errors
3. Ensure all dependencies are installed correctly
4. Verify GPU drivers are up to date (for GPU mode)
## License
This project uses CogVideoX-2B, which is licensed under Apache 2.0.
---
**Happy Video Generation!**