Byte Dream - Text-to-Image Model
Overview
Byte Dream is a production-ready text-to-image diffusion model optimized for CPU inference. It uses CLIP ViT-B/32 for text encoding and a custom UNet architecture for image generation.
Features
- CPU Optimized: Runs efficiently on CPU (no GPU required)
- High Quality: Generates 512x512 images
- Fast Inference: Optimized for speed
- Easy to Use: Simple Python API and web interface
- Open Source: MIT License
Installation
pip install torch pillow transformers
git lfs install
git clone https://huggingface.co/Enzo8930302/ByteDream
cd ByteDream
Usage
Quick Start
from bytedream import ByteDreamGenerator
# Load model
generator = ByteDreamGenerator(hf_repo_id="Enzo8930302/ByteDream")
# Generate image
image = generator.generate(
    prompt="A beautiful sunset over mountains, digital art",
    num_inference_steps=50,
    guidance_scale=7.5,
)
image.save("output.png")
Using Cloud API
from bytedream import ByteDreamHFClient
client = ByteDreamHFClient(
    repo_id="Enzo8930302/ByteDream",
    use_api=True,
)
image = client.generate(
    prompt="Futuristic city at night, cyberpunk",
)
image.save("output.png")
Training
Train on your own dataset:
# Create dataset
python create_test_dataset.py
# Train model
python train.py --config config.yaml --train_data dataset
Web Interface
Launch the Gradio web interface:
python app.py
Or deploy to Hugging Face Spaces:
python deploy_to_spaces.py --repo_id YourUsername/ByteDream-Space
Model Architecture
- Text Encoder: CLIP ViT-B/32 (512-dimensional text embeddings)
- UNet: Custom architecture with cross-attention
- VAE: Autoencoder that maps images to and from the latent space
- Scheduler: DDIM sampling
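To make the scheduler entry concrete, here is a minimal sketch of one deterministic DDIM update (eta = 0), written in plain Python over lists of floats. The function name `ddim_step` and its arguments are illustrative assumptions, not taken from the Byte Dream code; the actual `scheduler.py` may be organized differently.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0), applied elementwise.

    x_t            -- current noisy sample (list of floats)
    eps_pred       -- noise predicted by the model for x_t (same length)
    alpha_bar_t    -- cumulative alpha product at the current timestep
    alpha_bar_prev -- cumulative alpha product at the previous (less noisy) timestep
    """
    for a in (alpha_bar_t, alpha_bar_prev):
        if not 0.0 < a <= 1.0:
            raise ValueError("cumulative alphas must lie in (0, 1]")

    sqrt_a_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_a_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_a_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_a_prev = math.sqrt(1.0 - alpha_bar_prev)

    x_prev = []
    for x, eps in zip(x_t, eps_pred):
        # Estimate the clean sample implied by the current noise prediction.
        x0_pred = (x - sqrt_one_minus_a_t * eps) / sqrt_a_t
        # Re-noise the estimate to the previous timestep's noise level.
        x_prev.append(sqrt_a_prev * x0_pred + sqrt_one_minus_a_prev * eps)
    return x_prev
```

Because no fresh noise is added at each step, the same starting latent and prompt yield the same image, which is one reason DDIM is commonly paired with a reduced number of inference steps.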
Parameters
- Cross-attention dimension: 512
- Block channels: [128, 256, 512, 512]
- Attention heads: 4
- Layers per block: 1
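As a rough illustration, the parameters above could be collected in a small configuration object. The class name `UNetParams`, its field names, and the assumption that each block's channels are split evenly across the attention heads are all hypothetical; the real layout lives in `config.yaml` and `model.py`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UNetParams:
    """Hypothetical container for the UNet parameters listed above."""

    cross_attention_dim: int = 512
    block_out_channels: Tuple[int, ...] = (128, 256, 512, 512)
    attention_heads: int = 4
    layers_per_block: int = 1

    def channels_per_head(self) -> Tuple[int, ...]:
        """Per-block head width, assuming channels are split evenly across heads."""
        for channels in self.block_out_channels:
            if channels % self.attention_heads != 0:
                raise ValueError(
                    f"{channels} channels cannot be split evenly "
                    f"across {self.attention_heads} heads"
                )
        return tuple(c // self.attention_heads for c in self.block_out_channels)


params = UNetParams()
print(params.channels_per_head())
```

A frozen dataclass keeps the values immutable after construction, so a mismatch between channels and heads shows up as an explicit error rather than a shape failure deep inside the model.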
Examples
Prompts that work well:
- "A serene lake at sunset with mountains"
- "Futuristic city with flying cars, cyberpunk"
- "Majestic dragon flying over castle, fantasy"
- "Peaceful garden with cherry blossoms"
Tips:
- Use detailed, descriptive prompts
- Add style keywords (digital art, oil painting, etc.)
- Use negative prompts to avoid unwanted elements
- A higher guidance scale makes the image follow the prompt more closely
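The effect of the guidance scale can be sketched with the standard classifier-free guidance formula, which pushes the noise prediction away from the unconditional estimate and toward the prompt-conditioned one. Whether Byte Dream combines predictions exactly this way is an assumption, and the helper name `apply_guidance` is hypothetical.

```python
def apply_guidance(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: eps = eps_uncond + s * (eps_cond - eps_uncond).

    eps_uncond -- noise predicted without the prompt (list of floats)
    eps_cond   -- noise predicted with the prompt (same length)
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values standing in for model outputs.
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

# A scale of 1.0 reproduces the conditional prediction; larger scales
# extrapolate further in the direction the prompt suggests.
for scale in (1.0, 3.0, 7.5):
    print(scale, apply_guidance(uncond, cond, scale))
```

In many diffusion pipelines a negative prompt is implemented by using its embedding in place of the empty prompt when computing the unconditional prediction, so the same formula then steers the result away from the unwanted elements.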
File Structure
ByteDream/
├── bytedream/             # Core package
│   ├── __init__.py
│   ├── generator.py       # Main generator
│   ├── model.py           # Model architecture
│   ├── pipeline.py        # Pipeline
│   ├── scheduler.py       # Scheduler
│   ├── hf_api.py          # HF API client
│   └── utils.py
├── train.py               # Training script
├── infer.py               # Inference
├── app.py                 # Web UI
├── config.yaml            # Config
└── requirements.txt       # Dependencies
Requirements
- Python 3.8+
- PyTorch
- Pillow
- Transformers
- Gradio (for web UI)
See requirements.txt for the full list.
License
MIT License
Citation
@software{bytedream2024,
  title={Byte Dream: CPU-Optimized Text-to-Image Generation},
  year={2024}
}
Support
For issues or questions, please open an issue on GitHub.
Created by Enzo and the Byte Dream Team