---

license: mit
language: en
tags:
  - text-to-image
  - diffusion
  - cpu-optimized
  - bytedream
  - clip
pipeline_tag: text-to-image
---


# Byte Dream - Text-to-Image Model

## Overview
Byte Dream is a production-ready text-to-image diffusion model optimized for CPU inference. 
It uses CLIP ViT-B/32 for text encoding and a custom UNet architecture for image generation.

## Features
- ✅ **CPU Optimized**: Runs efficiently on CPU (no GPU required)
- ✅ **High Quality**: Generates 512x512 images
- ✅ **Fast Inference**: Optimized for speed
- ✅ **Easy to Use**: Simple Python API and web interface
- ✅ **Open Source**: MIT License

## Installation

```bash
pip install torch pillow transformers
git lfs install
git clone https://huggingface.co/Enzo8930302/ByteDream
cd ByteDream
```

## Usage

### Quick Start
```python
from bytedream import ByteDreamGenerator

# Load model
generator = ByteDreamGenerator(hf_repo_id="Enzo8930302/ByteDream")

# Generate image
image = generator.generate(
    prompt="A beautiful sunset over mountains, digital art",
    num_inference_steps=50,
    guidance_scale=7.5,
)
image.save("output.png")
```

### Using Cloud API
```python
from bytedream import ByteDreamHFClient

client = ByteDreamHFClient(
    repo_id="Enzo8930302/ByteDream",
    use_api=True,
)

image = client.generate(
    prompt="Futuristic city at night, cyberpunk",
)
image.save("output.png")
```

## Training

Train on your own dataset:

```bash
# Create dataset
python create_test_dataset.py

# Train model
python train.py --config config.yaml --train_data dataset
```

## Web Interface

Launch the Gradio web interface:

```bash
python app.py
```

Or deploy to Hugging Face Spaces:

```bash
python deploy_to_spaces.py --repo_id YourUsername/ByteDream-Space
```

## Model Architecture

- **Text Encoder**: CLIP ViT-B/32 (512 dimensions)
- **UNet**: Custom architecture with cross-attention
- **VAE**: Autoencoder for latent space
- **Scheduler**: DDIM sampling
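
To make the scheduler entry concrete, here is a minimal sketch of one deterministic DDIM update (eta = 0) in plain Python. It is not taken from `bytedream/scheduler.py`: the function name `ddim_step`, the flat-list representation of the latent, and the `alpha_bar` arguments (cumulative products of the noise schedule) are assumptions made for illustration.

```python
import math


def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) on a flat list of floats.

    x_t            -- current noisy latent values
    eps            -- noise predicted by the UNet for x_t
    alpha_bar_t    -- cumulative alpha at the current timestep (0 < value <= 1)
    alpha_bar_prev -- cumulative alpha at the previous, less noisy timestep
    """
    sqrt_a_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_a_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_a_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_a_prev = math.sqrt(1.0 - alpha_bar_prev)

    x_prev = []
    for x, e in zip(x_t, eps):
        # Estimate the clean sample implied by the noise prediction.
        x0_pred = (x - sqrt_one_minus_a_t * e) / sqrt_a_t
        # Re-noise that estimate to the previous timestep's noise level.
        x_prev.append(sqrt_a_prev * x0_pred + sqrt_one_minus_a_prev * e)
    return x_prev
```

In a full sampler this step would run once per inference step, with `eps` coming from the UNet at each timestep; fewer steps trade quality for speed.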

### Parameters
- Cross-attention dimension: 512
- Block channels: [128, 256, 512, 512]
- Attention heads: 4
- Layers per block: 1
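
The parameters above could be held in a small config object, sketched below. The class name `UNetConfig`, its field names, and the `head_dims` helper are hypothetical and are not the names used in `config.yaml` or `bytedream/model.py`. The helper also assumes that each block splits its channels evenly across the attention heads, which the model card does not state.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UNetConfig:
    """Hypothetical container for the UNet parameters listed above."""

    cross_attention_dim: int = 512
    block_out_channels: Tuple[int, ...] = (128, 256, 512, 512)
    attention_heads: int = 4
    layers_per_block: int = 1

    def head_dims(self) -> List[int]:
        """Per-head width in each block, assuming an even channel split."""
        dims = []
        for channels in self.block_out_channels:
            if channels % self.attention_heads != 0:
                raise ValueError(
                    f"{channels} channels do not divide evenly "
                    f"across {self.attention_heads} heads"
                )
            dims.append(channels // self.attention_heads)
        return dims


config = UNetConfig()
print(config.head_dims())
```

The cross-attention dimension of 512 matches the CLIP ViT-B/32 text embedding width given under Model Architecture, so the text features can feed the cross-attention layers without a projection in between.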

## Examples

### Prompts that work well:
- "A serene lake at sunset with mountains"
- "Futuristic city with flying cars, cyberpunk"
- "Majestic dragon flying over castle, fantasy"
- "Peaceful garden with cherry blossoms"

### Tips:
- Use detailed, descriptive prompts
- Add style keywords (digital art, oil painting, etc.)
- Use negative prompts to avoid unwanted elements
- A higher guidance scale follows the prompt more closely, usually at the cost of variety
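
The effect of `guidance_scale` can be illustrated with classifier-free guidance, the usual mechanism behind this parameter in diffusion models. The sketch below assumes Byte Dream follows that convention, which the model card does not confirm; `apply_guidance` is a hypothetical helper, not part of the `bytedream` package.

```python
def apply_guidance(eps_uncond, eps_cond, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions.

    A scale of 0 ignores the prompt, 1 uses the conditioned prediction
    unchanged, and larger values push further in the prompt's direction.
    """
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(eps_uncond, eps_cond)
    ]


# Toy values standing in for two UNet outputs on the same latent.
eps_uncond = [0.0, 2.0]
eps_cond = [1.0, 4.0]
print(apply_guidance(eps_uncond, eps_cond, 7.5))
```

Under this scheme a negative prompt typically replaces the empty prompt used for the unconditional prediction, so the update moves away from the unwanted content.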

## File Structure

```
ByteDream/
├── bytedream/          # Core package
│   ├── __init__.py
│   ├── generator.py    # Main generator
│   ├── model.py        # Model architecture
│   ├── pipeline.py     # Pipeline
│   ├── scheduler.py    # Scheduler
│   ├── hf_api.py       # HF API client
│   └── utils.py
├── train.py            # Training script
├── infer.py            # Inference
├── app.py              # Web UI
├── config.yaml         # Config
└── requirements.txt    # Dependencies
```

## Requirements

- Python 3.8+
- PyTorch
- Pillow
- Transformers
- Gradio (for web UI)

See `requirements.txt` for the full list.

## License

MIT License

## Citation

```bibtex
@software{bytedream2024,
  title={Byte Dream: CPU-Optimized Text-to-Image Generation},
  year={2024}
}
```

## Links

- [GitHub](https://github.com/yourusername/bytedream)
- [Documentation](https://huggingface.co/Enzo8930302/ByteDream/blob/main/README.md)
- [Spaces Demo](https://huggingface.co/spaces/Enzo8930302/ByteDream-Space)

## Support

For issues or questions, please open an issue on GitHub.

---

**Created by Enzo and the Byte Dream Team** 🎨