---
license: mit
library_name: sd15-flow-trainer
tags:
- geometric-deep-learning
- stable-diffusion
- ksimplex
- pentachoron
- flow-matching
- cross-attention-prior
base_model: sd-legacy/stable-diffusion-v1-5
pipeline_tag: text-to-image
---

# KSimplex Geometric Attention Prior

Geometric cross-attention prior for SD1.5 using pentachoron (4-simplex) structures.

# Before and After

## Pretrain

## Final

## Architecture

| Component | Params |
|-----------|--------|
| SD1.5 UNet (frozen) | 859,520,964 |
| **Geo prior (trained)** | **4,845,725** |

The geometric prior modulates CLIP encoder hidden states through 4-layer stacked k-simplex attention before they reach the 16 cross-attention blocks in the UNet.

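
To put the two parameter counts in the table in proportion, the arithmetic below computes how small the trained prior is relative to the frozen UNet. This is only a worked calculation on the numbers reported above; the variable names are illustrative and are not part of `sd15_trainer_geo`.

```python
# Parameter counts copied from the Architecture table.
frozen_unet_params = 859_520_964
geo_prior_params = 4_845_725

# Share of the combined UNet + prior parameter count that is actually trained.
total_params = frozen_unet_params + geo_prior_params
trainable_fraction = geo_prior_params / total_params

# Size of the prior relative to the frozen UNet alone.
ratio_to_unet = geo_prior_params / frozen_unet_params

print(f"total parameters:   {total_params:,}")
print(f"trainable fraction: {trainable_fraction:.4%}")
print(f"prior / UNet ratio: {ratio_to_unet:.4%}")
```

In both views the trained prior is roughly 0.56% of the model, so the frozen UNet accounts for almost all of the parameters.
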
## Simplex Configuration

| Parameter | Value |
|-----------|-------|
| k (simplex dim) | 4 |
| Embedding dim | 32 |
| Feature dim | 768 |
| Stacked layers | 4 |
| Attention heads | 8 |
| Base deformation | 0.25 |
| Residual blend | learnable |
| Timestep conditioned | True |

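
For readers unfamiliar with the term, a pentachoron is the simplex with k = 4, the setting in the table above. The sketch below counts its faces using the standard combinatorial identity (a k-simplex has C(k+1, j+1) faces of dimension j). It illustrates the geometry only; it is not the prior's implementation, and `face_counts` is a hypothetical helper name.

```python
from itertools import combinations
from math import comb

K = 4  # simplex dimension from the configuration table (pentachoron)


def face_counts(k: int) -> dict[int, int]:
    """Map each face dimension j to the number of j-faces of a k-simplex."""
    return {j: comb(k + 1, j + 1) for j in range(k + 1)}


counts = face_counts(K)

# Label the vertices 0..K; in a simplex every pair of vertices shares an edge.
edges = list(combinations(range(K + 1), 2))

print(counts)      # {0: 5, 1: 10, 2: 10, 3: 5, 4: 1}
print(len(edges))  # 10
```

So a pentachoron has 5 vertices, 10 edges, 10 triangular faces, and 5 tetrahedral cells.
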
## Usage

```python
from sd15_trainer_geo.pipeline import load_pipeline, load_geo_from_hub

# Load base SD1.5 + fresh geo prior
pipe = load_pipeline()

# Load trained geo weights from this repo
load_geo_from_hub(pipe, "AbstractPhil/sd15-geoflow-characters")

# Or one-shot: load base + geo in one call
pipe = load_pipeline(geo_repo_id="AbstractPhil/sd15-geoflow-characters")
```

## Training Info

- **dataset**: AbstractPhil/synthetic-characters (schnell_simple_2)
- **samples**: 50000
- **epochs**: 1
- **steps**: 8333
- **shift**: 2.5
- **base_lr**: 5e-05
- **min_snr_gamma**: 5.0
- **cfg_dropout**: 0.1
- **batch_size**: 6
- **geo_loss_weight**: 0.01
- **loss_final**: 0.3177722838521004

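
Two of the values listed above can be checked or illustrated with a few lines of arithmetic. The step count matches 50000 samples at batch size 6 if the final partial batch is dropped, which is an assumption about the data loader. The card does not say how `shift` is applied; the sketch assumes the timestep shift commonly used in flow-matching trainers, `t' = shift * t / (1 + (shift - 1) * t)`, which may differ from what the trainer actually does. Both function names are hypothetical.

```python
def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Full batches per epoch, ASSUMING the final partial batch is dropped."""
    return num_samples // batch_size


def shift_timestep(t: float, shift: float) -> float:
    """ASSUMED shift form: t' = shift * t / (1 + (shift - 1) * t), for t in [0, 1]."""
    return shift * t / (1.0 + (shift - 1.0) * t)


# samples, batch_size, and epochs taken from the Training Info list.
total_steps = steps_per_epoch(50_000, 6) * 1
print(total_steps)  # 8333

# With shift > 1 the mapping keeps 0 and 1 fixed and pushes interior
# timesteps towards 1.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f} -> shifted={shift_timestep(t, 2.5):.4f}")
```

Under the drop-last assumption, 2 of the 50000 samples are left over after 8333 batches of 6.
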
# Assessment

## License

MIT ([AbstractPhil](https://huggingface.co/AbstractPhil))