---
license: mit
language: en
tags:
- text-to-image
- diffusion
- cpu-optimized
- bytedream
- clip
pipeline_tag: text-to-image
---
# Byte Dream - Text-to-Image Model
## Overview
Byte Dream is a production-ready text-to-image diffusion model optimized for CPU inference.
It uses CLIP ViT-B/32 for text encoding and a custom UNet architecture for image generation.
## Features
- ✅ **CPU Optimized**: Runs efficiently on CPU (no GPU required)
- ✅ **High Quality**: Generates 512x512 images
- ✅ **Fast Inference**: Optimized for speed
- ✅ **Easy to Use**: Simple Python API and web interface
- ✅ **Open Source**: MIT License
## Installation
```bash
pip install torch pillow transformers
git lfs install
git clone https://huggingface.co/Enzo8930302/ByteDream
cd ByteDream
```
## Usage
### Quick Start
```python
from bytedream import ByteDreamGenerator
# Load model
generator = ByteDreamGenerator(hf_repo_id="Enzo8930302/ByteDream")
# Generate image
image = generator.generate(
    prompt="A beautiful sunset over mountains, digital art",
    num_inference_steps=50,
    guidance_scale=7.5,
)
image.save("output.png")
```
### Using Cloud API
```python
from bytedream import ByteDreamHFClient
client = ByteDreamHFClient(
    repo_id="Enzo8930302/ByteDream",
    use_api=True,
)
image = client.generate(
    prompt="Futuristic city at night, cyberpunk",
)
image.save("output.png")
```
## Training
Train on your own dataset:
```bash
# Create dataset
python create_test_dataset.py
# Train model
python train.py --config config.yaml --train_data dataset
```
## Web Interface
Launch the Gradio web interface:
```bash
python app.py
```
Or deploy to Hugging Face Spaces:
```bash
python deploy_to_spaces.py --repo_id YourUsername/ByteDream-Space
```
## Model Architecture
- **Text Encoder**: CLIP ViT-B/32 (512 dimensions)
- **UNet**: Custom architecture with cross-attention
- **VAE**: Autoencoder for latent space
- **Scheduler**: DDIM sampling
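
The scheduler entry above names DDIM sampling. To illustrate what one deterministic DDIM update (eta = 0) computes, here is a minimal sketch in plain Python. The function `ddim_step` and its argument names are hypothetical and are not taken from `bytedream/scheduler.py`; the actual scheduler may differ in noise schedule, tensor types, and parameterization.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) on a flat list of floats.

    x_t            -- current noisy sample
    eps_pred       -- noise predicted by the model for x_t
    alpha_bar_t    -- cumulative alpha product at the current timestep
    alpha_bar_prev -- cumulative alpha product at the previous (less noisy) timestep
    """
    sqrt_a_t = math.sqrt(alpha_bar_t)
    sqrt_noise_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_a_prev = math.sqrt(alpha_bar_prev)
    sqrt_noise_prev = math.sqrt(1.0 - alpha_bar_prev)

    x_prev = []
    for x, eps in zip(x_t, eps_pred):
        # Estimate the clean sample implied by the predicted noise.
        x0_pred = (x - sqrt_noise_t * eps) / sqrt_a_t
        # Re-noise that estimate to the previous timestep's noise level.
        x_prev.append(sqrt_a_prev * x0_pred + sqrt_noise_prev * eps)
    return x_prev


# Illustrative values only: two "pixels" and one step of the schedule.
sample = ddim_step([0.52, -0.62], [0.2, 0.3], alpha_bar_t=0.64, alpha_bar_prev=0.81)
```

Because no fresh noise is added at each step, the same starting noise and prompt should give the same image, which is why DDIM is often run with fewer steps than ancestral samplers.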
### Parameters
- Cross-attention dimension: 512
- Block channels: [128, 256, 512, 512]
- Attention heads: 4
- Layers per block: 1
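
To make the relationship between these values concrete, the sketch below collects them in a small dataclass and derives the per-head width at each UNet block. The class name `UNetConfig`, its field names, and the assumption that each block's channels are split evenly across the attention heads are illustrative only; the real keys in `config.yaml` may be named differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UNetConfig:
    """Hypothetical container for the hyperparameters listed above."""

    cross_attention_dim: int = 512
    block_out_channels: Tuple[int, ...] = (128, 256, 512, 512)
    attention_heads: int = 4
    layers_per_block: int = 1

    def __post_init__(self):
        # Assumption: every block's channels must split evenly across heads.
        for channels in self.block_out_channels:
            if channels % self.attention_heads != 0:
                raise ValueError(
                    f"{channels} channels cannot be split across "
                    f"{self.attention_heads} heads"
                )

    def head_dims(self) -> List[int]:
        """Per-head channel width at each block, under the even-split assumption."""
        return [c // self.attention_heads for c in self.block_out_channels]


config = UNetConfig()
print(config.head_dims())  # [32, 64, 128, 128]
```

The cross-attention dimension of 512 matches the 512-dimensional CLIP ViT-B/32 text embedding listed under the text encoder, so the text features can feed the cross-attention layers without an extra projection.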
## Examples
### Prompts that work well:
- "A serene lake at sunset with mountains"
- "Futuristic city with flying cars, cyberpunk"
- "Majestic dragon flying over castle, fantasy"
- "Peaceful garden with cherry blossoms"
### Tips:
- Use detailed, descriptive prompts
- Add style keywords (digital art, oil painting, etc.)
- Use negative prompts to avoid unwanted elements
- A higher guidance scale makes the image follow the prompt more closely
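
The effect of the guidance scale can be sketched with the usual classifier-free guidance formula. This assumes Byte Dream applies classifier-free guidance, which the `guidance_scale` parameter suggests but this README does not confirm; the function `apply_guidance` is a hypothetical helper, not part of the `bytedream` package.

```python
def apply_guidance(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and prompt-conditioned noise predictions.

    A scale of 1.0 returns the conditioned prediction unchanged; larger
    values push the result further in the direction the prompt suggests.
    """
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(eps_uncond, eps_cond)
    ]


# Illustrative values only: the gap between the two predictions is amplified.
guided = apply_guidance([0.0, 0.2], [1.0, 0.4], guidance_scale=7.5)
```

In pipelines that work this way, a negative prompt typically replaces the empty prompt used for the unconditional prediction, so the guided result is steered away from it.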
## File Structure
```
ByteDream/
├── bytedream/           # Core package
│   ├── __init__.py
│   ├── generator.py     # Main generator
│   ├── model.py         # Model architecture
│   ├── pipeline.py      # Pipeline
│   ├── scheduler.py     # Scheduler
│   ├── hf_api.py        # HF API client
│   └── utils.py
├── train.py             # Training script
├── infer.py             # Inference
├── app.py               # Web UI
├── config.yaml          # Config
└── requirements.txt     # Dependencies
```
## Requirements
- Python 3.8+
- PyTorch
- Pillow
- Transformers
- Gradio (for web UI)
See `requirements.txt` for the full list.
## License
MIT License
## Citation
```bibtex
@software{bytedream2024,
title={Byte Dream: CPU-Optimized Text-to-Image Generation},
year={2024}
}
```
## Links
- [GitHub](https://github.com/yourusername/bytedream)
- [Documentation](https://huggingface.co/Enzo8930302/ByteDream/blob/main/README.md)
- [Spaces Demo](https://huggingface.co/spaces/Enzo8930302/ByteDream-Space)
## Support
For issues or questions, please open an issue on GitHub.
---
**Created by Enzo and the Byte Dream Team** 🎨