---

license: mit
language: en
tags:
  - text-to-image
  - diffusion
  - cpu-optimized
  - bytedream
  - clip
pipeline_tag: text-to-image
---


# Byte Dream - Text-to-Image Model

## Overview
Byte Dream is a production-ready text-to-image diffusion model optimized for CPU inference. 
It uses CLIP ViT-B/32 for text encoding and a custom UNet architecture for image generation.
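At a high level the pipeline follows the usual latent-diffusion dataflow: prompt → CLIP text embedding → denoised latent → decoded image. The sketch below walks through the tensor shapes only; the exact dimensions (a 77-token CLIP context, 512-dim embeddings, a 4×64×64 latent decoded to a 512×512 RGB image) are assumptions based on the architecture notes below, not values read from the ByteDream code.

```python
import numpy as np

# Hypothetical shape walk-through of a latent-diffusion pipeline.
# All dimensions are assumptions, not taken from the ByteDream source.

def text_to_image_shapes():
    # 1. CLIP ViT-B/32 encodes the tokenized prompt: (batch, tokens, dim)
    text_emb = np.zeros((1, 77, 512))

    # 2. The UNet iteratively denoises a latent, conditioned on the
    #    text embedding via cross-attention: (batch, channels, h, w)
    latent = np.zeros((1, 4, 64, 64))

    # 3. The VAE decoder upsamples the latent 8x to pixel space
    image = np.zeros((512, 512, 3))  # final RGB image
    return text_emb.shape, latent.shape, image.shape
```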

## Features
- ✅ **CPU Optimized**: Runs efficiently on CPU (no GPU required)
- ✅ **High Quality**: Generates 512x512 images
- ✅ **Fast Inference**: Optimized for speed
- ✅ **Easy to Use**: Simple Python API and web interface
- ✅ **Open Source**: MIT License

## Installation

```bash
pip install torch pillow transformers
git lfs install
git clone https://huggingface.co/Enzo8930302/ByteDream
cd ByteDream
```

## Usage

### Quick Start
```python
from bytedream import ByteDreamGenerator

# Load model
generator = ByteDreamGenerator(hf_repo_id="Enzo8930302/ByteDream")

# Generate image
image = generator.generate(
    prompt="A beautiful sunset over mountains, digital art",
    num_inference_steps=50,
    guidance_scale=7.5,
)
image.save("output.png")
```

### Using Cloud API
```python
from bytedream import ByteDreamHFClient

client = ByteDreamHFClient(
    repo_id="Enzo8930302/ByteDream",
    use_api=True,
)

image = client.generate(
    prompt="Futuristic city at night, cyberpunk",
)
image.save("output.png")
```

## Training

Train on your own dataset:

```bash
# Create dataset
python create_test_dataset.py

# Train model
python train.py --config config.yaml --train_data dataset
```

## Web Interface

Launch Gradio web interface:

```bash
python app.py
```

Or deploy to Hugging Face Spaces:

```bash
python deploy_to_spaces.py --repo_id YourUsername/ByteDream-Space
```

## Model Architecture

- **Text Encoder**: CLIP ViT-B/32 (512 dimensions)
- **UNet**: Custom architecture with cross-attention
- **VAE**: Autoencoder for latent space
- **Scheduler**: DDIM sampling
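For reference, a single DDIM update (the deterministic, eta = 0 case) can be sketched in a few lines. This is the generic DDIM formula, not the repository's `scheduler.py` implementation.

```python
import math

def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t:            current noisy latent (a scalar here for simplicity)
    eps:            noise predicted by the UNet at this step
    alpha_bar_*:    cumulative noise-schedule products for t and t-1
    """
    # Recover the model's estimate of the clean sample x_0
    pred_x0 = (x_t - math.sqrt(1 - alpha_bar_t) * eps) / math.sqrt(alpha_bar_t)
    # Re-noise the estimate to the previous timestep's noise level
    return math.sqrt(alpha_bar_prev) * pred_x0 + math.sqrt(1 - alpha_bar_prev) * eps
```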

### Parameters
- Cross-attention dimension: 512
- Block channels: [128, 256, 512, 512]
- Attention heads: 4
- Layers per block: 1
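These values would typically live in `config.yaml`. The fragment below is a hypothetical sketch of how they might be laid out; the key names are illustrative, not the repository's actual schema.

```yaml
# Hypothetical config.yaml fragment -- key names are illustrative
model:
  cross_attention_dim: 512      # must match the CLIP ViT-B/32 text width
  block_out_channels: [128, 256, 512, 512]
  attention_heads: 4
  layers_per_block: 1
scheduler:
  type: ddim
  num_train_timesteps: 1000     # assumed default, not confirmed by the repo
```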

## Examples

### Prompts that work well:
- "A serene lake at sunset with mountains"
- "Futuristic city with flying cars, cyberpunk"
- "Majestic dragon flying over castle, fantasy"
- "Peaceful garden with cherry blossoms"

### Tips:
- Use detailed, descriptive prompts
- Add style keywords ("digital art", "oil painting", etc.)
- Use negative prompts to avoid unwanted elements
- A higher guidance scale makes the image follow the prompt more closely, at the cost of variety
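The prompt-writing tips above can be wrapped in a small helper that assembles a descriptive prompt and a matching negative prompt. `build_prompt` is a hypothetical convenience function, not part of the `bytedream` package.

```python
def build_prompt(subject, style_keywords=(), negative=()):
    """Assemble a detailed prompt plus a negative-prompt string.

    Hypothetical helper -- not part of the bytedream API.
    """
    prompt = ", ".join([subject, *style_keywords])
    negative_prompt = ", ".join(negative)
    return prompt, negative_prompt

prompt, negative = build_prompt(
    "A serene lake at sunset with mountains",
    style_keywords=("digital art", "highly detailed"),
    negative=("blurry", "low quality"),
)
```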

## Files Structure

```
ByteDream/
├── bytedream/          # Core package
│   ├── __init__.py
│   ├── generator.py    # Main generator
│   ├── model.py        # Model architecture
│   ├── pipeline.py     # Pipeline
│   ├── scheduler.py    # Scheduler
│   ├── hf_api.py       # HF API client
│   └── utils.py
├── train.py            # Training script
├── infer.py            # Inference
├── app.py              # Web UI
├── config.yaml         # Config
└── requirements.txt    # Dependencies
```

## Requirements

- Python 3.8+
- PyTorch
- Pillow
- Transformers
- Gradio (for web UI)

See `requirements.txt` for full list.

## License

MIT License

## Citation

```bibtex
@software{bytedream2024,
  title={Byte Dream: CPU-Optimized Text-to-Image Generation},
  year={2024}
}
```

## Links

- [GitHub](https://github.com/yourusername/bytedream)
- [Documentation](https://huggingface.co/Enzo8930302/ByteDream/blob/main/README.md)
- [Spaces Demo](https://huggingface.co/spaces/Enzo8930302/ByteDream-Space)

## Support

For issues or questions, please open an issue on GitHub.

---

**Created by Enzo and the Byte Dream Team** 🎨