bobarker33 committed
Commit df234c5 · verified · 1 Parent(s): ebf1139

Upload 4 files

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +111 -0
  3. model_index.json +32 -0
  4. scheduler/scheduler_config.json +6 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+images/FictionalChromaBanner_1.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,111 @@
---
license: apache-2.0
pipeline_tag: text-to-image
---
# Chroma1-Base

Chroma1-Base is an **8.9B**-parameter text-to-image foundational model based on **FLUX.1-schnell**. It is fully **Apache 2.0 licensed**, ensuring that anyone can use, modify, and build upon it.

As a **base model**, Chroma1 is intentionally designed to be an excellent starting point for **finetuning**. It provides a strong, neutral foundation for developers, researchers, and artists to create specialized models.

For the fast, CFG-"baked" version, please see [Chroma1-Flash](https://huggingface.co/lodestones/Chroma1-Flash).

### Key Features
* **High-Performance Base:** 8.9B parameters, built on the powerful FLUX.1 architecture.
* **Easily Finetunable:** Designed as an ideal checkpoint for creating custom, specialized models.
* **Community-Driven & Open-Source:** Fully transparent, with an Apache 2.0 license and public training history.
* **Flexible by Design:** Provides a flexible foundation for a wide range of generative tasks.

## Special Thanks
A massive thank you to our supporters, who make this project possible.
* **Anonymous donor** whose incredible generosity funded the pretraining run and data collection. Your support has been transformative for open-source AI.
* **Fictional.ai** for their fantastic support and for helping push the boundaries of open-source AI. You can try Chroma on their platform:

[![FictionalChromaBanner_1.png](./images/FictionalChromaBanner_1.png)](https://fictional.ai/?ref=chroma_hf)

## How to Use

### `diffusers` Library

Install the requirements:

`pip install transformers diffusers sentencepiece accelerate`

```python
import torch
from diffusers import ChromaPipeline

pipe = ChromaPipeline.from_pretrained("lodestones/Chroma1-Base", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = [
    "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
]
negative_prompt = ["low quality, ugly, unfinished, out of focus, deformed, disfigured, blurry, smudged, restricted palette, flat colors"]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    generator=torch.Generator("cpu").manual_seed(433),
    num_inference_steps=40,
    guidance_scale=3.0,
    num_images_per_prompt=1,
).images[0]
image.save("chroma.png")
```
### ComfyUI

For advanced users and customized workflows, you can use Chroma with ComfyUI.

**Requirements:**
* A working ComfyUI installation.
* [Chroma checkpoint](https://huggingface.co/lodestones/Chroma) (latest version).
* [T5 XXL Text Encoder](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors).
* [FLUX VAE](https://huggingface.co/lodestones/Chroma/resolve/main/ae.safetensors).
* [Chroma Workflow JSON](https://huggingface.co/lodestones/Chroma/resolve/main/ChromaSimpleWorkflow20250507.json).

**Setup:**
1. Place the `T5_xxl` model in your `ComfyUI/models/clip` folder.
2. Place the `FLUX VAE` in your `ComfyUI/models/vae` folder.
3. Place the `Chroma checkpoint` in your `ComfyUI/models/diffusion_models` folder.
4. Load the Chroma workflow file into ComfyUI and run it.
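The setup steps above amount to copying each downloaded file into the matching ComfyUI subfolder. As a rough sketch, a small helper like the following can automate that; the file names and the ComfyUI root path are illustrative assumptions, not part of this model card:

```python
import shutil
from pathlib import Path

# Hypothetical mapping of downloaded file names to ComfyUI subfolders.
# "chroma.safetensors" is a placeholder name for the Chroma checkpoint.
DESTINATIONS = {
    "t5xxl_fp16.safetensors": "models/clip",
    "ae.safetensors": "models/vae",
    "chroma.safetensors": "models/diffusion_models",
}

def place_files(comfy_root, downloads):
    """Copy each downloaded file into its ComfyUI subfolder, creating folders as needed."""
    placed = {}
    for src in map(Path, downloads):
        subdir = DESTINATIONS.get(src.name)
        if subdir is None:
            continue  # not one of the files this workflow needs
        dest_dir = Path(comfy_root) / subdir
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest_dir / src.name)
        placed[src.name] = str(dest_dir / src.name)
    return placed
```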
## Model Details
* **Architecture:** Based on the 8.9B-parameter FLUX.1-schnell model.
* **Training Data:** Trained on a 5M-sample dataset curated from a 20M pool, including artistic, photographic, and niche styles.
* **Technical Report:** A comprehensive technical paper detailing the architectural modifications and training process is forthcoming.

## Intended Use
Chroma is intended to be used as a **base model** for researchers and developers to build upon. It is ideal for:
* Finetuning on specific styles, concepts, or characters.
* Research into generative model behavior, alignment, and safety.
* Use as a foundational component in larger AI systems.

## Limitations and Bias Statement
Chroma is trained on a broad, filtered dataset from the internet. As such, it may reflect the biases and stereotypes present in its training data. The model is released as-is and has not been aligned with a specific safety filter.

Users are responsible for their own use of this model. It has the potential to generate content that may be considered harmful, explicit, or offensive. I encourage developers to implement appropriate safeguards and ethical considerations in their downstream applications.

## Summary of Architectural Modifications
*(For a full breakdown, a tech report is coming soon-ish.)*

* **12B → 8.9B Parameters:**
  * **TL;DR:** I replaced a 3.3B-parameter timestep-encoding layer with a more efficient 250M-parameter FFN, as the original was vastly oversized for its task.
* **MMDiT Masking:**
  * **TL;DR:** Masking T5 padding tokens enhanced fidelity and increased training stability by preventing the model from attending to irrelevant `<pad>` tokens.
* **Custom Timestep Distributions:**
  * **TL;DR:** I implemented a custom timestep sampling distribution (`-x^2`) to prevent loss spikes and ensure the model trains effectively on both high-noise and low-noise regions.
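To illustrate the masking point above: a padding mask sends attention scores at `<pad>` positions to negative infinity before the softmax, so those tokens receive exactly zero attention weight. A minimal pure-Python sketch of the idea (illustrative, not the actual Chroma code):

```python
import math

def build_key_mask(attention_mask):
    # attention_mask: 1 for real tokens, 0 for <pad> tokens,
    # as returned by a T5 tokenizer.
    return [0.0 if m == 1 else float("-inf") for m in attention_mask]

def masked_softmax(scores, key_mask):
    # Add the mask to the raw scores, then softmax;
    # -inf entries end up with exactly zero weight.
    shifted = [s + m for s, m in zip(scores, key_mask)]
    mx = max(s for s in shifted if s != float("-inf"))
    exps = [math.exp(s - mx) if s != float("-inf") else 0.0 for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]
```

With the mask applied, `<pad>` positions contribute nothing to the attention output, so the model cannot latch onto them during training.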
## P.S.
Chroma1-Base is Chroma-v.48.

## Citation
```
@misc{rock2025chroma,
  author = {Lodestone Rock},
  title = {Chroma1-Base},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/lodestones/Chroma1-Base}},
}
```
model_index.json ADDED
@@ -0,0 +1,32 @@
{
  "_class_name": "ChromaPipeline",
  "_diffusers_version": "0.34.0",
  "feature_extractor": [null, null],
  "image_encoder": [null, null],
  "scheduler": ["diffusers", "FlowMatchEulerDiscreteScheduler"],
  "text_encoder": ["transformers", "T5EncoderModel"],
  "tokenizer": ["transformers", "T5Tokenizer"],
  "transformer": ["diffusers", "ChromaTransformer2DModel"],
  "vae": ["diffusers", "AutoencoderKL"]
}
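For readers unfamiliar with the file, `model_index.json` maps each pipeline component to the library and class that `diffusers` should load for it; `[null, null]` marks optional components this pipeline does not use. A small stdlib sketch of reading that mapping (illustrative, not the actual `diffusers` loader):

```python
import json

# Abbreviated copy of the model_index.json above.
MODEL_INDEX = """{
  "_class_name": "ChromaPipeline",
  "feature_extractor": [null, null],
  "scheduler": ["diffusers", "FlowMatchEulerDiscreteScheduler"],
  "text_encoder": ["transformers", "T5EncoderModel"],
  "transformer": ["diffusers", "ChromaTransformer2DModel"],
  "vae": ["diffusers", "AutoencoderKL"]
}"""

def component_classes(index_json):
    """Return {component: (library, class_name)}, skipping metadata keys and null components."""
    out = {}
    for name, value in json.loads(index_json).items():
        if name.startswith("_"):
            continue  # metadata such as _class_name, _diffusers_version
        library, cls = value
        if library is None:
            continue  # optional component not used by this pipeline
        out[name] = (library, cls)
    return out
```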
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,6 @@
{
  "_class_name": "FlowMatchEulerDiscreteScheduler",
  "_diffusers_version": "0.34.0",
  "num_train_timesteps": 1000,
  "use_beta_sigmas": true
}
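The scheduler configured above is a flow-matching Euler integrator: each denoising step moves the latent along the model's predicted velocity from the current sigma to the next one in the schedule. A scalar sketch of the generic update rule (this is the textbook Euler step for a flow ODE, not diffusers' exact implementation):

```python
def euler_flow_step(sample, velocity, sigma, sigma_next):
    # One Euler step of the flow ODE: x_next = x + (sigma_next - sigma) * v
    return sample + (sigma_next - sigma) * velocity

def integrate(x0, velocity_fn, sigmas):
    # Walk the sigma schedule (e.g. from 1.0 down to 0.0) with Euler steps.
    x = x0
    for s, s_next in zip(sigmas[:-1], sigmas[1:]):
        x = euler_flow_step(x, velocity_fn(x, s), s, s_next)
    return x
```

With a constant velocity field the integrator reduces to `x0 + (sigmas[-1] - sigmas[0]) * v`, which is a quick sanity check on the direction of the update.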