---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license
base_model:
- nebulette/cozyberry-g4-vision
---

C/B-SIDE

A diffusion model with a BERT text encoder. It is backward compatible with the T5 tokenizer.

Spatial encoding loss was calculated as [explained elsewhere](https://huggingface.co/nebulette/fashion-side).

![](images/couple.png)

Cozyberry was chosen as the only text encoder. There are no adapters.

As in [waifu diffusion](https://ruwwww.github.io/al-folio/blog/2026/waifu-diffusion/), image output alignment requires 10-100x less VRAM because training uses random patch cropping.

![](images/crop.png)
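
The random patch cropping mentioned above can be illustrated with a minimal sketch. This is not the training code of this model: the function names `random_patch_box` and `crop`, the square patch shape, and the toy 8x8 image are assumptions made only for illustration. The idea is that each training step sees a small randomly placed window instead of the full image, so the number of pixels processed per sample shrinks by roughly `(height * width) / (patch * patch)`.

```python
import random


def random_patch_box(height, width, patch, rng=random):
    """Pick a random patch x patch square inside a height x width image.

    Hypothetical helper: returns (top, left, bottom, right) in pixel units.
    """
    if patch > height or patch > width:
        raise ValueError("patch must fit inside the image")
    top = rng.randint(0, height - patch)   # randint is inclusive on both ends
    left = rng.randint(0, width - patch)
    return top, left, top + patch, left + patch


def crop(image, box):
    """Crop a row-major nested list (rows of pixels) to the given box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Toy 8x8 "image" whose pixel value records its own (row, col) position.
image = [[(r, c) for c in range(8)] for r in range(8)]

rng = random.Random(0)             # seeded so the sketch is repeatable
box = random_patch_box(8, 8, 4, rng)
patch = crop(image, box)

# Pixels per sample drop from 8*8 to 4*4 in this toy case.
reduction = (8 * 8) / (4 * 4)
```

In this toy case the patch holds a quarter of the original pixels. The actual savings in a real model depend on the architecture and resolution, which this sketch does not attempt to model.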

History

- After evaluating different text encoders, the final lightweight [BERT model](https://huggingface.co/nebulette/rnberry) was born
- Later, thousands of classes were extracted from danbooru 2025-26, and the model learned from both [textual and visual cues](https://huggingface.co/nebulette/cozyberry-g4-vision)
- In this release, horizontal scenes were further reinforced exclusively for the BERT model

Source data

- synthetic booru character fashion
- horizontal scenes