# Byte Dream - Setup Guide
## Quick Start (Windows)
### 1. Install Dependencies
#### Option A: Using pip (Recommended)
```cmd
cd "c:\Users\Enzo\Documents\Byte Dream"
pip install -r requirements.txt
```
#### Option B: Using conda
```cmd
cd "c:\Users\Enzo\Documents\Byte Dream"
conda env create -f environment.yml
conda activate bytedream
```
### 2. Verify Installation
```cmd
python quick_start.py
```
This will check if all dependencies are installed and test the model.
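A dependency check like the one `quick_start.py` performs can be sketched in a few lines of standard-library Python (an illustration only; the actual script may run additional model tests):

```python
import importlib.util

# Sketch of a dependency check: verify each required package is importable.
# This package list is an assumption based on the troubleshooting section below.
required = ["torch", "transformers", "diffusers"]

missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing dependencies:", ", ".join(missing))
    print("Run: pip install -r requirements.txt")
else:
    print("All dependencies found.")
```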
### 3. Generate Your First Image
#### Command Line
```cmd
python infer.py --prompt "A beautiful sunset over mountains, digital art" --output sunset.png
```
#### Web Interface
```cmd
python app.py
```
Then open http://localhost:7860 in your browser.
#### Python Script
```python
from bytedream import ByteDreamGenerator

generator = ByteDreamGenerator()
image = generator.generate(
    prompt="A cyberpunk city at night with neon lights",
    num_inference_steps=50,
    guidance_scale=7.5
)
image.save("cyberpunk_city.png")
```
## Model Training
### Prepare Your Dataset
1. Collect images in a folder (JPG, PNG formats)
2. Optionally add .txt files with captions for each image
3. Run preparation script:
```cmd
python prepare_dataset.py --input ./my_images --output ./processed_data --size 512
```
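The caption convention from step 2 can be illustrated with a small pairing helper (a sketch; `prepare_dataset.py` may match captions differently, but the same-stem `.txt` sidecar pattern is assumed here):

```python
from pathlib import Path

def collect_pairs(folder):
    """Pair each image with a same-stem .txt caption, if one exists."""
    pairs = []
    for img in sorted(Path(folder).glob("*")):
        if img.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # skip caption files and anything else
        caption_file = img.with_suffix(".txt")
        caption = caption_file.read_text(encoding="utf-8").strip() if caption_file.exists() else ""
        pairs.append((img.name, caption))
    return pairs
```

With this convention, `sunset.png` picks up its caption from `sunset.txt`; images without a sidecar file simply get an empty caption.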
### Train the Model
```cmd
python train.py --train_data ./processed_data --output_dir ./models/bytedream --epochs 100 --batch_size 4
```
Training time depends on:
- Dataset size
- Number of epochs
- CPU speed (expect several hours to days for CPU training)
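As a rough sanity check on training time, the total number of optimizer steps scales with dataset size and epoch count (simple arithmetic; the 1,000-image dataset is an assumed example, and wall-clock time per step depends on your CPU):

```python
import math

# Back-of-envelope step count for the command above:
# batch_size=4 and epochs=100 come from the example; 1,000 images is assumed.
images, batch_size, epochs = 1000, 4, 100

steps_per_epoch = math.ceil(images / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 250 25000
```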
## Hugging Face Deployment
### Upload to Hugging Face Hub
1. Get your Hugging Face token from https://huggingface.co/settings/tokens
2. Upload model:
```cmd
python upload_to_hf.py --model_path ./models/bytedream --repo_id your_username/bytedream --token YOUR_TOKEN
```
### Deploy to Spaces
1. Create Gradio app file (already included as `app.py`)
2. Go to https://huggingface.co/spaces
3. Click "Create new Space"
4. Choose Gradio SDK
5. Upload all project files
6. Select CPU hardware (e.g. the free "CPU basic" tier)
7. Deploy!
## File Structure
```
Byte Dream/
├── bytedream/              # Core package
│   ├── __init__.py         # Package initialization
│   ├── model.py            # Neural network architectures
│   ├── pipeline.py         # Generation pipeline
│   ├── scheduler.py        # Diffusion scheduler
│   ├── generator.py        # Main generator class
│   └── utils.py            # Utility functions
├── train.py                # Training script
├── infer.py                # Command-line inference
├── app.py                  # Gradio web interface
├── main.py                 # High-level application API
├── prepare_dataset.py      # Dataset preparation
├── upload_to_hf.py         # Hugging Face upload
├── quick_start.py          # Quick start guide
├── config.yaml             # Configuration
├── requirements.txt        # Python dependencies
├── environment.yml         # Conda environment
├── README.md               # Documentation
└── LICENSE                 # MIT License
```
## Usage Examples
### Basic Generation
```cmd
python infer.py -p "A dragon flying over castle" -o dragon.png
```
### Advanced Parameters
```cmd
python infer.py -p "Fantasy landscape" -n "ugly, blurry" -W 768 -H 768 -s 75 -g 8.0 --seed 42
```
Here `-n` is the negative prompt, `-W`/`-H` set the width and height, `-s` the number of inference steps, `-g` the guidance scale, and `--seed` fixes the random seed for reproducible output.
### Batch Generation (Python)
```python
from bytedream import ByteDreamGenerator

generator = ByteDreamGenerator()
prompts = [
    "Sunset beach, palm trees, tropical paradise",
    "Mountain landscape, snow peaks, alpine lake",
    "Forest path, sunlight filtering through trees"
]
images = generator.generate_batch(
    prompts=prompts,
    width=512,
    height=512,
    num_inference_steps=50
)
for i, img in enumerate(images):
    img.save(f"landscape_{i}.png")
```
## Performance Optimization
### CPU Optimization
The model is already optimized for CPU, but you can:
1. Increase threads in `config.yaml`:
```yaml
cpu_optimization:
  threads: 8  # Set to number of CPU cores
  precision: fp32
```
2. Use fewer inference steps for faster generation:
```cmd
python infer.py -p "Quick preview" -s 20
```
3. Generate smaller images:
```cmd
python infer.py -p "Small image" -W 256 -H 256
```
### Memory Management
For systems with limited RAM:
1. Enable memory-efficient mode (already default)
2. Generate one image at a time
3. Restart Python between batch generations
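Releasing references between generations is a general Python pattern, not specific to Byte Dream: drop the last reference to a large image or tensor, then force a collection pass so the interpreter can reclaim memory sooner.

```python
import gc

# Simulate a large in-memory result (stand-in for a generated image/tensor).
big_result = [0] * 1_000_000

# Drop the reference, then force a garbage-collection pass.
del big_result
collected = gc.collect()  # returns the number of unreachable objects found
```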
## Troubleshooting
### Import Errors
If you get import errors:
```cmd
pip install --upgrade torch transformers diffusers
```
### Memory Errors
Reduce image size or inference steps:
```cmd
python infer.py -p "Test" -W 256 -H 256 -s 20
```
### Slow Generation
CPU generation is slower than GPU. Expect:
- 256x256: ~30-60 seconds
- 512x512: ~2-5 minutes
- 768x768: ~5-10 minutes
Times vary by CPU speed and number of steps.
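Those estimates are consistent with generation cost growing roughly linearly in pixel count (a simplification that ignores model internals, but useful for planning):

```python
# Rough cost model: relative time ~ width * height (steps held constant).
base = 256 * 256
for w, h in [(256, 256), (512, 512), (768, 768)]:
    rel = (w * h) / base
    print(f"{w}x{h}: ~{rel:.0f}x the 256x256 time")
```

So 30-60 seconds at 256x256 scales to roughly 2-4 minutes at 512x512 (4x) and 4.5-9 minutes at 768x768 (9x), in line with the ranges above.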
### Model Not Loading
The model needs trained weights. Either:
1. Train your own model using `train.py`
2. Download pretrained weights from Hugging Face
3. Use Stable Diffusion weights as base
## Tips for Better Results
### Writing Prompts
- Be specific and descriptive
- Include style references ("digital art", "oil painting")
- Mention lighting ("dramatic lighting", "soft sunlight")
- Add quality modifiers ("highly detailed", "4K", "masterpiece")
### Negative Prompts
Use to avoid common issues:
```
ugly, blurry, low quality, distorted, deformed, bad anatomy, extra limbs
```
### Parameters
- **Steps**: 20-30 (quick), 50 (good), 75-100 (best)
- **Guidance**: 5-7 (creative), 7-9 (balanced), 9-12 (strict)
- **Resolution**: Start with 512x512, increase if needed
## Advanced Features
### Custom Schedulers
Edit `config.yaml` to try different schedulers:
- DDIM (default) - Fast, deterministic
- EulerDiscrete - Alternative sampling
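A hypothetical `config.yaml` fragment for switching schedulers (the exact key names are an assumption; check the shipped `config.yaml` for the real fields):

```yaml
# Hypothetical key names - confirm against the shipped config.yaml
scheduler:
  type: EulerDiscrete  # default: DDIM
```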
### Fine-tuning
Fine-tune on specific styles:
1. Collect 50-100 images in desired style
2. Prepare dataset
3. Train for 50-100 epochs with low learning rate (1e-6)
## Support
For issues and questions:
1. Check this guide first
2. Review README.md
3. Check code comments
4. Visit Hugging Face documentation
## Updates
Check for updates and improvements:
- New model architectures
- Better CPU optimization
- Additional features
- Bug fixes
Enjoy creating with Byte Dream! 🎨