---
license: mit
library_name: sd15-flow-trainer
tags:
- geometric-deep-learning
- stable-diffusion
- ksimplex
- pentachoron
- flow-matching
- cross-attention-prior
base_model: sd-legacy/stable-diffusion-v1-5
pipeline_tag: text-to-image
---

# V1 weights test push

https://github.com/AbstractEyes/sd15-flow-trainer

https://huggingface.co/AbstractPhil/sd15-rectified-geometric-matching/blob/main/colab_trainer.py

Step 0

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/L8W0i4EzWc2XKKU3YImzp.png)

Step 500

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/cexS-dFaojxUebsW1KqR4.png)

Step 1000

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/wFZsCCiTawdTMQZKtEs9x.png)

Step 1500

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/9polT0bIUJ5ypM4PJLsOk.png)

# KSimplex Geometric Attention Prior

Geometric cross-attention prior for SD1.5 using pentachoron (4-simplex) structures.

## Architecture

| Component | Params |
|-----------|--------|
| SD1.5 UNet (frozen) | 859,520,964 |
| **Geo prior (trained)** | **4,845,725** |

The geometric prior modulates CLIP encoder hidden states through 4-layer stacked k-simplex attention before they reach the 16 cross-attention blocks in the UNet.
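For readers unfamiliar with the term, a pentachoron is the regular 4-simplex: 5 vertices, every pair joined by an edge of equal length. The sketch below only illustrates that combinatorial structure using the Python standard library. It is not the repository's implementation, and the helper names `regular_simplex_vertices` and `face_counts` are hypothetical, introduced here for illustration.

```python
from itertools import combinations
from math import comb, dist, isclose, sqrt

K = 4  # simplex dimension; k = 4 gives the pentachoron


def regular_simplex_vertices(k):
    """Return the k+1 vertices of a regular k-simplex.

    Uses the standard basis vectors of R^(k+1), which are all
    mutually equidistant (distance sqrt(2)).
    """
    n = k + 1
    return [tuple(1.0 if i == j else 0.0 for j in range(n)) for i in range(n)]


def face_counts(k):
    """Return the number of j-dimensional faces of a k-simplex for j = 0..k."""
    return [comb(k + 1, j + 1) for j in range(k + 1)]


vertices = regular_simplex_vertices(K)

# Every pair of vertices forms an edge in a simplex.
edges = list(combinations(range(len(vertices)), 2))
edge_lengths = [dist(vertices[a], vertices[b]) for a, b in edges]

print(len(vertices), "vertices,", len(edges), "edges")
print("faces by dimension:", face_counts(K))
```

All edges having the same length is what makes the simplex regular; a deformed simplex would relax that constraint while keeping the same vertex and edge counts.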
## Simplex Configuration

| Parameter | Value |
|-----------|-------|
| k (simplex dim) | 4 |
| Embedding dim | 32 |
| Feature dim | 768 |
| Stacked layers | 4 |
| Attention heads | 8 |
| Base deformation | 0.25 |
| Residual blend | learnable |
| Timestep conditioned | True |

## Usage

```python
from sd15_trainer_geo.pipeline import load_pipeline, load_geo_from_hub

# Load base SD1.5 + fresh geo prior
pipe = load_pipeline()

# Load trained geo weights from this repo
load_geo_from_hub(pipe, "AbstractPhil/sd15-rectified-geometric-matching")

# Or one-shot: load base + geo in one call
pipe = load_pipeline(geo_repo_id="AbstractPhil/sd15-rectified-geometric-matching")
```

## Training Info

- **dataset**: AbstractPhil/imagenet-synthetic (flux_schnell_512)
- **samples**: 10000
- **epochs**: 1
- **shift**: 2.5
- **base_lr**: 0.0001
- **min_snr_gamma**: 5.0
- **cfg_dropout**: 0.1
- **batch_size**: 6
- **loss_final**: 0.3784324672818184

## Post Analysis

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/qs2vvoY7f9HdfYYuGI-k5.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/5MpURgWYrFmxpZf8KPQWG.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/ofgomH4SkBbyAtcQDeLBn.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/hxA8Q0rsm6wYpQDgB4puQ.png)

## License

MIT, [AbstractPhil](https://huggingface.co/AbstractPhil)