| --- |
| license: other |
| license_name: nvidia-open-model-license |
| license_link: >- |
| https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license |
| base_model: |
| - nebulette/cozyberry-g4-vision |
| --- |
| |
| C/B-SIDE |
|
|
| A diffusion model that uses BERT as its text encoder. It is backward compatible with the T5 tokenizer. |
|
|
| The spatial encoding loss was calculated as [explained elsewhere](https://huggingface.co/nebulette/fashion-side). |
|
|
|  |
|
|
| Cozyberry was chosen as the only text encoder. There are no adapters. |
|
|
| As in [waifu diffusion](https://ruwwww.github.io/al-folio/blog/2026/waifu-diffusion/), image output alignment requires 10-100x less VRAM because training uses random patch cropping. |
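
To give a rough feel for why random patch cropping saves memory, here is a minimal sketch in plain Python. It is not the training code of this model: the helper names (`random_patch_box`, `crop`, `pixel_ratio`) and the image and patch sizes are hypothetical, and it assumes that activation memory scales roughly linearly with the number of pixels processed per step.

```python
import random


def random_patch_box(height, width, patch, rng=random):
    """Pick a random patch x patch crop window inside a height x width image."""
    if patch > height or patch > width:
        raise ValueError("patch must fit inside the image")
    top = rng.randint(0, height - patch)    # randint is inclusive on both ends
    left = rng.randint(0, width - patch)
    return top, left, top + patch, left + patch


def crop(image, box):
    """Crop a 2D list-of-rows image to the (top, left, bottom, right) box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def pixel_ratio(height, width, patch):
    """How many times fewer pixels a patch holds than the full image."""
    return (height * width) / (patch * patch)


# Toy 8x8 "image" whose pixel values encode their own position (hypothetical data).
image = [[y * 8 + x for x in range(8)] for y in range(8)]
box = random_patch_box(8, 8, 4, random.Random(0))
patch = crop(image, box)

# Illustrative sizes only: a 256x256 patch of a 1024x1024 image holds
# 16x fewer pixels, and a 128x128 patch holds 64x fewer.
print(pixel_ratio(1024, 1024, 256))  # 16.0
print(pixel_ratio(1024, 1024, 128))  # 64.0
```

Under the linear-scaling assumption, the two illustrative ratios fall inside the 10-100x range quoted above. The actual saving depends on the architecture and on the patch size used in training, neither of which is specified here.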
|
|
|  |
|
|
| History |
|
|
| - After evaluating different text encoders, the final lightweight [BERT model](https://huggingface.co/nebulette/rnberry) was created |
| - Later on, thousands of classes were extracted from Danbooru 2025-26, and the model learned from both [textual and visual cues](https://huggingface.co/nebulette/cozyberry-g4-vision) |
| - In this release, horizontal scenes were further reinforced exclusively for the BERT model |
|
|
| Source data |
|
|
| - synthetic booru character fashion |
| - horizontal scenes |