---
license: cc-by-nc-4.0
language:
- en
tags:
- diffusion
- anime
- image-generation
- dit
- flow-matching
pipeline_tag: text-to-image
---
# Diffusion Transformer
A flow matching-based diffusion transformer for anime image generation.
This project is for **research purposes only**.
## Links
- GitHub: https://github.com/FREEANIMA/diffusion_model_sampling
- Hugging Face: https://huggingface.co/honghong3/diffusion-transformer
## License
This project is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
For research and non-commercial use only.
## Training Environment
- **GPU**: NVIDIA A100 40GB (Google Colab)
- **Dataset**: ~4.8M anime images
- **Processed**: ~1.8M images (epoch 0, ongoing)
- **Throughput**: ~1.3 it/s
- Samples below are intermediate checkpoints; quality will improve as training continues.
## Training & Samples
| 12k images | 600k images | 1.2M images | 1.8M images |
|---|---|---|---|
| ![1k](assets/1k.png) | ![50k](assets/50k.png) | ![100k](assets/100k.png) | ![150k](assets/150k.png) |
```python
# Sampler settings used for the conditional samples above
prompt = "1girl, red hair, school uniform, happy, red eyes, open mouth, detailed face"
steps = 100
cfg_scale = 2.0
seed = 1234
```
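As a rough illustration of how `steps`, `cfg_scale`, and `seed` might enter a flow-matching sampler, the sketch below integrates a toy velocity field with fixed-step Euler updates and classifier-free guidance. It is not the code in `app/sampling.py`: the function names, the time convention (`x_t = (1 - t) * x_0 + t * noise`, so t = 1 is noise and t = 0 is data), and the toy velocity fields are all assumptions made for this example.

```python
import random


def guided_velocity(v_cond, v_uncond, cfg_scale):
    """Classifier-free guidance: start at the unconditional prediction and
    move cfg_scale times the (conditional - unconditional) difference."""
    return [u + cfg_scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(velocity_fn, x, steps):
    """Integrate dx/dt = v(x, t) from t = 1 (noise) down to t = 0 (data)
    with `steps` fixed-size Euler updates (assumed convention)."""
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        v = velocity_fn(x, t)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(target):
    """Hypothetical stand-in for the network: the exact velocity of a
    straight path that reaches `target` at t = 0."""
    return lambda x, t: [(xi - target) / t for xi in x]


steps, cfg_scale, seed = 100, 2.0, 1234
rng = random.Random(seed)               # the seed fixes the initial noise
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]

v_cond = toy_velocity(1.0)              # "prompted" branch heads to 1.0
v_uncond = toy_velocity(0.0)            # "empty prompt" branch heads to 0.0


def model(x, t):
    return guided_velocity(v_cond(x, t), v_uncond(x, t), cfg_scale)


sample = euler_sample(model, noise, steps)
```

In this toy setup, `cfg_scale = 2.0` pushes the result past the conditional target (toward 2.0 rather than 1.0), which is the extrapolation that guidance performs. Setting `cfg_scale = 1.0` recovers the conditional branch alone.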
## Model Architecture
- **Backbone**: Diffusion Transformer (DiT) with adaLN modulation
- **Parameters**: ~550M
- **Framework**: Flow Matching (velocity prediction)
![architecture](assets/model.png)
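
The adaLN modulation named above can be sketched for a single token vector as follows. This follows the common DiT formulation (normalize without learned affine parameters, apply a conditioning-dependent scale and shift, then gate the residual branch). Whether this model uses exactly this variant is an assumption, and `layer_norm`, `modulate`, and `adaln_block` are illustrative names, not functions from `app/model.py`.

```python
import math


def layer_norm(x, eps=1e-6):
    """LayerNorm without learned affine parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def modulate(x, shift, scale):
    """Apply a conditioning-dependent scale and shift, elementwise."""
    return [v * (1.0 + s) + b for v, s, b in zip(x, scale, shift)]


def adaln_block(x, shift, scale, gate, sublayer):
    """Compute x + gate * sublayer(modulate(layer_norm(x))).
    In a DiT, shift/scale/gate are regressed from the timestep and text
    conditioning; here they are passed in directly."""
    h = sublayer(modulate(layer_norm(x), shift, scale))
    return [xi + g * hi for xi, g, hi in zip(x, gate, h)]


x = [0.5, -1.0, 2.0, 0.0]
zeros = [0.0] * len(x)
identity = lambda h: h  # stand-in for attention or the MLP

# With a zero gate (the adaLN-Zero initialisation) the block is the identity.
out = adaln_block(x, shift=zeros, scale=zeros, gate=zeros, sublayer=identity)
```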
## Components
| Component | Model |
|---|---|
| VAE | stabilityai/sd-vae-ft-mse |
| Text Encoder | openai/clip-vit-large-patch14 |
| Tokenizer | openai/clip-vit-large-patch14 |
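
The three pretrained components can be loaded from the Hub with the standard `diffusers` and `transformers` classes. This is a sketch of one way to do it, not necessarily how `app/clip.py` and `app/sd_vae.py` do it. `COMPONENT_IDS` and `load_components` are names made up for this example, and calling the function downloads the weights on first use.

```python
COMPONENT_IDS = {
    "vae": "stabilityai/sd-vae-ft-mse",
    "text_encoder": "openai/clip-vit-large-patch14",
    "tokenizer": "openai/clip-vit-large-patch14",
}


def load_components(device="cpu"):
    """Load the VAE, CLIP text encoder, and tokenizer in eval mode."""
    # Imported here so the heavy dependencies are only needed when loading.
    from diffusers import AutoencoderKL
    from transformers import CLIPTextModel, CLIPTokenizer

    vae = AutoencoderKL.from_pretrained(COMPONENT_IDS["vae"]).to(device).eval()
    text_encoder = (
        CLIPTextModel.from_pretrained(COMPONENT_IDS["text_encoder"])
        .to(device)
        .eval()
    )
    tokenizer = CLIPTokenizer.from_pretrained(COMPONENT_IDS["tokenizer"])
    return vae, text_encoder, tokenizer
```

Calling `load_components()` once and reusing the returned objects avoids reloading the weights for every prompt.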
## Sampler Details
- **Resolution**: 512 × 512 (single bucket)
- **Noise Schedule**: Log-SNR uniform sampling with resolution-dependent shift
- **CFG**: Classifier-free guidance
- Prompts are **tag-based** (comma-separated danbooru-style tags)
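
The card does not spell out the noise schedule, so the following is only one plausible reading of "log-SNR uniform sampling with resolution-dependent shift". It assumes the interpolation `x_t = (1 - t) * x_0 + t * noise`, for which log-SNR = 2 * log((1 - t) / t), draws the log-SNR uniformly from a range, and lowers it by 2 * log(shift) with shift = sqrt(pixels / base_pixels), as is commonly done for rectified-flow models. The range limits, the base resolution, and all function names are assumptions.

```python
import math
import random


def t_from_logsnr(logsnr):
    """Invert logsnr = 2 * log((1 - t) / t) for t in (0, 1)."""
    return 1.0 / (1.0 + math.exp(logsnr / 2.0))


def resolution_shift(height, width, base_pixels=256 * 256):
    """Hypothetical shift factor: square root of the pixel-count ratio."""
    return math.sqrt(height * width / base_pixels)


def sample_timestep(rng, height, width, logsnr_min=-10.0, logsnr_max=10.0):
    """Draw log-SNR uniformly, lower it by 2 * log(shift), then map it to t."""
    logsnr = rng.uniform(logsnr_min, logsnr_max)
    logsnr -= 2.0 * math.log(resolution_shift(height, width))
    return t_from_logsnr(logsnr)


rng = random.Random(0)
timesteps = [sample_timestep(rng, 512, 512) for _ in range(5)]
```

Lowering the log-SNR by 2 * log(shift) is equivalent to remapping t to shift * t / (1 + (shift - 1) * t), so a shift above 1 moves samples toward higher noise levels.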
## Requirements
```bash
pip install torch transformers diffusers accelerate torchvision tqdm
```
## Usage
```bash
python main.py
```
```
C:.
│  main.py
│  output.png
│  README.md
│  requirements.txt
│
├─app
│  │  clip.py
│  │  config.json
│  │  config.py
│  │  model.py
│  │  sampling.py
│  │  sd_vae.py
│  └─ __init__.py
│
├─assets
│      100k.png
│      150k.png
│      1k.png
│      50k.png
│
└─weights
        image.pth
```