Text-to-Image
diffusion
safety
dose-response
felfri committed on
Commit 950e259 · verified · 1 Parent(s): ccb39cf

Upload config.yaml with huggingface_hub

Files changed (1)
  1. config.yaml +602 -0
config.yaml ADDED
@@ -0,0 +1,602 @@
1
+ diffusion_model:
2
+ _model_class: PRX
3
+ in_channels: 3
4
+ patch_size: 32
5
+ context_in_dim: 2304
6
+ hidden_size: 1792
7
+ mlp_ratio: 3.5
8
+ num_heads: 28
9
+ depth: 16
10
+ axes_dim:
11
+ - 32
12
+ - 32
13
+ theta: 10000
14
+ time_factor: 1000.0
15
+ time_max_period: 10000
16
+ conditioning_block_ids: null
17
+ bottleneck_size: 256
18
+ diffusion_text_tower:
19
+ preset_name: t5gemma2b-256-bf16
20
+ model_name: google/t5gemma-2b-2b-ul2
21
+ prompt_max_tokens: 256
22
+ use_attn_mask: true
23
+ use_last_hidden_state: true
24
+ only_tokenizer: false
25
+ torch_dtype: torch.bfloat16
26
+ unpadded: false
27
+ diffusion_vae:
28
+ model_name: identity
29
+ model_class: IdentityVAE
30
+ default_channels: 3
31
+ torch_dtype: torch.bfloat16
32
+ diffusion_scheduler:
33
+ prediction_type: x_prediction_flow_matching
34
+ num_train_timesteps: 1000
35
+ timestep_shift: 3.0
36
+ denoiser_dtype: torch.float
37
+ optimizer:
38
+ _target_: prx.training.optimizer.create_muon_optimizer
39
+ _recursive_: false
40
+ muon_name_filter: blocks
41
+ muon_config:
42
+ lr: 0.0001
43
+ momentum: 0.95
44
+ nesterov: true
45
+ ns_steps: 5
46
+ rms_scale: true
47
+ weight_decay: 0.0
48
+ adam_config:
49
+ lr: 0.0001
50
+ betas:
51
+ - 0.9
52
+ - 0.95
53
+ eps: 1.0e-08
54
+ weight_decay: 0.0
55
+ dataset:
56
+ train_dataset:
57
+ _target_: prx.dataset.StreamingProcessedDataset
58
+ local:
59
+ - /checkpoint/dream/felixfriedrich/diffusion_safety/dose_response/mds/safe_c4
60
+ - /checkpoint/dream/felixfriedrich/diffusion_safety/dose_response/mds/unsafe_c4
61
+ caption_keys:
62
+ - - prompt
63
+ - 1.0
64
+ text_tower: t5gemma2b-256-bf16
65
+ prompt_max_tokens: 256
66
+ cache_limit: 8tb
67
+ download_timeout: 7200
68
+ drop_last: true
69
+ shuffle: true
70
+ batching_method: device_per_stream
71
+ prefetch_factor: 2
72
+ num_workers: 8
73
+ persistent_workers: true
74
+ pin_memory: true
75
+ has_mask_text_latents: false
76
+ has_text_latents: false
77
+ transforms:
78
+ - _target_: prx.dataset.transforms.ArAwareResize
79
+ default_image_size: 512
80
+ patch_size_pixels: 32
81
+ transforms_targets:
82
+ - image
83
+ shuffle_seed: 478163327
84
+ eval_dataset:
85
+ _target_: prx.dataset.StreamingProcessedDataset
86
+ local:
87
+ - /checkpoint/dream/felixfriedrich/diffusion_safety/dose_response/mds/safe_c4
88
+ - /checkpoint/dream/felixfriedrich/diffusion_safety/dose_response/mds/unsafe_c4
89
+ caption_keys:
90
+ - - prompt
91
+ - 1.0
92
+ text_tower: t5gemma2b-256-bf16
93
+ prompt_max_tokens: 256
94
+ cache_limit: 8tb
95
+ download_timeout: 7200
96
+ drop_last: true
97
+ shuffle: true
98
+ batching_method: device_per_stream
99
+ prefetch_factor: 2
100
+ num_workers: 8
101
+ persistent_workers: true
102
+ pin_memory: true
103
+ has_mask_text_latents: false
104
+ has_text_latents: false
105
+ transforms:
106
+ - _target_: prx.dataset.transforms.ArAwareResize
107
+ default_image_size: 512
108
+ patch_size_pixels: 32
109
+ transforms_targets:
110
+ - image
111
+ shuffle_seed: 107420369
112
+ callbacks:
113
+ image_monitor:
114
+ prompt:
115
+ - A curious tabby cat perched on a moss-covered log in a lush, misty forest at
116
+ dawn. Sunbeams filter through the canopy, illuminating dewdrops on ferns and
117
+ wildflowers surrounding the feline.
118
+ - A majestic male lion with a flowing golden mane leaping gracefully over a rushing
119
+ river in the African savanna. The water sparkles in the afternoon sun, with
120
+ a vibrant rainbow arching across the sky in the background.
121
+ - A breathtaking twilight view of the Eiffel Tower, its intricate iron lattice
122
+ illuminated against a purple and orange sky. The Champ de Mars stretches below,
123
+ dotted with twinkling lights and couples strolling hand in hand.
124
+ - The opulent Hall of Mirrors inside the Palace of Versailles, bathed in warm
125
+ golden light. Crystal chandeliers reflect in the polished marble floor, while
126
+ ornate gilded frames and frescoed ceilings showcase 18th-century artistry at
127
+ its finest.
128
+ - The magnificent glass dome of the Paris Grand Palais glowing ethereally at dusk.
129
+ The Beaux-Arts architecture is accentuated by dramatic lighting, with the Seine
130
+ River flowing peacefully in the foreground.
131
+ - The Arc de Triomphe standing proudly at the center of Place Charles de Gaulle,
132
+ illuminated by the warm glow of street lamps. Streaks of car lights circle the
133
+ monument, creating a dynamic long-exposure effect against the deep blue evening
134
+ sky.
135
+ - An exquisite crystal bottle of luxury perfume resting on a mirrored surface.
136
+ Soft, diffused lighting catches the facets of the glass, creating a sparkling
137
+ effect. A single orchid bloom and scattered rose petals add a touch of elegance
138
+ to the composition.
139
+ - A close-up portrait of a strikingly beautiful woman with piercing green eyes
140
+ and flawless skin. Soft, natural lighting enhances her features, while a gentle
141
+ breeze tousles her flowing chestnut hair. Her expression is both mysterious
142
+ and alluring.
143
+ - A carefree young child with tousled hair and rosy cheeks, laughing joyfully
144
+ while running through a sunlit meadow. Butterflies and soap bubbles float around
145
+ the child, adding to the sense of wonder and innocence.
146
+ - The skilled, flour-dusted hands of an artisan baker kneading a large ball of
147
+ dough on a rustic wooden table. Shafts of early morning light illuminate the
148
+ scene, highlighting the texture of the dough and the baker's strong, capable
149
+ fingers.
150
+ - The word "Photoroom" written in vibrant, multicolored neon letters against a
151
+ dark brick wall. The letters flicker and glow, casting a warm, inviting light
152
+ that reflects off nearby surfaces and creates an atmosphere of creativity and
153
+ energy.
154
+ - A sleek, modern logo for an AI company specializing in commerce photography.
155
+ The design incorporates a stylized camera lens seamlessly blended with a circuit
156
+ board pattern, symbolizing the fusion of technology and visual arts. The color
157
+ scheme features deep blues and silver, conveying trust and innovation.
158
+ - Photography of a powerful, full-maned lion in mid-leap, emerging from a large,
159
+ moss-covered stone in a moonlit savanna. The night sky is star-filled, with
160
+ a bright full moon casting a silvery glow on the scene. The lion's fur is detailed,
161
+ reflecting the moonlight, emphasizing its muscular build and focused expression
162
+ as it jumps.
163
+ - Professional photography of a domestic cat with sleek, shiny fur, sitting elegantly
164
+ amidst a dense forest setting. The forest is lush, with tall, sun-dappled trees
165
+ and a carpet of vibrant green ferns. The cat, with piercing green eyes, appears
166
+ alert and poised, its fur pattern blending harmoniously with the natural surroundings.
167
+ - The photo depicts an astronaut in full space gear, riding a horse across an
168
+ open field. The detailed space suit contrasts sharply with the natural surroundings,
169
+ while the horse gallops gracefully, its coat shining in the sunlight. This surreal
170
+ scene combines the cutting-edge realm of space exploration with the timeless
171
+ beauty of nature, creating a striking visual contrast.
172
+ - Photography of a small, cheerful cactus with a big, happy face, standing alone
173
+ in the vast Sahara desert. The cactus has bright green spikes and is wearing
174
+ a tiny sombrero. The desert around it is expansive, with rolling sand dunes
175
+ under a clear, blue sky, and the sun blazing down, casting sharp shadows on
176
+ the sand.
177
+ - Photo of a cute hedgehog and a shearwater bird, both donning festive Christmas
178
+ hats. They are surrounded by a snowy landscape with a backdrop of pine trees
179
+ lightly dusted with snow. The hedgehog's spines are covered in tiny snowflakes,
180
+ and the shearwater's feathers are ruffled, adding to the whimsical, festive
181
+ atmosphere.
182
+ - The image is a photography of a calm, serene dog in a meditative pose, sitting
183
+ on a lush green meadow. The dog has a peaceful expression, with its eyes gently
184
+ closed and paws placed together in a Zen-like posture. The surrounding meadow
185
+ is dotted with wildflowers and a gentle breeze ruffles the dog's fur, enhancing
186
+ the sense of tranquility.
187
+ - The photo showcases a beautiful, sparkling ring set against a festive Christmas
188
+ backdrop. The ring is placed on a soft, red velvet cushion with delicate snowflake
189
+ patterns embroidered on it. Surrounding the ring are pine cones, holly leaves,
190
+ and twinkling fairy lights, creating a warm and inviting Christmas atmosphere.
191
+ - The photo features an elegant bottle of red wine, standing on a polished marble
192
+ table. The marble has intricate veins of grey and white, and the wine bottle
193
+ is adorned with a sophisticated, vintage label. The background is softly blurred,
194
+ focusing attention on the reflective glass of the bottle and the rich, deep
195
+ color of the wine.
196
+ - Photography of a bustling city street at dusk. Neon signs illuminate the scene,
197
+ reflecting off the wet pavement. People are walking briskly, some holding umbrellas.
198
+ Tall buildings line the street, their windows glowing softly in the evening
199
+ light.
200
+ - Design photography of a scene set in a cozy mountain cabin. A roaring fireplace
201
+ casts a warm glow over the room, with a plush sofa and a knitted throw blanket
202
+ in the foreground. Through the window, snow-covered trees and a starry night
203
+ sky can be seen.
204
+ - A photo of a tranquil beach at sunrise. The sky is a mix of soft pinks and oranges,
205
+ and the gentle waves are lapping at the shore. A lone figure walks along the
206
+ water's edge, leaving footprints in the wet sand.
207
+ - The photography captures a snowy city park at night. Street lamps cast a soft
208
+ glow on the snow-covered paths and benches. Trees with bare branches are dusted
209
+ with snow, and the city skyline is visible in the distance.
210
+ - An old, cobblestone street in a European city. Colorful buildings with flower
211
+ boxes in the windows line the street. A bicycle is parked against a lamppost,
212
+ and a small café with outdoor seating can be seen in the corner.
213
+ - A photo of a spacious modern kitchen. The room is bathed in natural light from
214
+ large windows, highlighting the sleek marble countertops and stainless steel
215
+ appliances. A large island sits in the center, adorned with fresh fruits and
216
+ flowers.
217
+ - An image of a serene Japanese garden. A winding stone path leads through meticulously
218
+ manicured bushes and flowering plants, with a tranquil koi pond at its heart.
219
+ Traditional lanterns and a small wooden bridge enhance the peaceful ambiance.
220
+ - A photography taken in a vintage library with towering bookshelves filled to
221
+ the brim. A large globe and antique furniture are present, with a ladder on
222
+ wheels for reaching the higher shelves. Soft light filters through stained glass
223
+ windows, casting colorful patterns on the floor.
224
+ - A magazine photo of a monkey bathing in a hot spring in a snowstorm with steam
225
+ coming off the water.
226
+ - A highly detailed professional close-up photo of an animorphic Bengal tiger
227
+ wearing a white, ribbed tank top, sunglasses and headphones around his neck
228
+ as a DJ with its paws on the turntable on stage at an outdoor electronic dance
229
+ music concert in Ibiza at night; party atmosphere, wispy smoke with caustic
230
+ lighting.
231
+ - A white square on a black background, with a single black dot in the center.
232
+ The dot is perfectly round and sharply defined, contrasting starkly against
233
+ the white surface. The image is minimalistic, emphasizing the simplicity and
234
+ clarity of the composition.
235
+ - This is a digital painting depicting two figures, seemingly conjoined, their
236
+ faces obscured by textured, decaying wrappings. The style is dark, surreal,
237
+ and evocative of gothic horror. The color palette is predominantly monochrome,
238
+ using shades of gray, black, and beige, with hints of dark brown. The background
239
+ is a textured beige canvas with darker, crackled areas, suggesting age and decay.
240
+ The figures' faces are partially visible, with dark, hollow eyes and somber
241
+ expressions. The wrappings are intricately detailed, with visible folds, cracks,
242
+ and drips of a dark substance, possibly resembling tears or blood. The lighting
243
+ is subdued and moody, casting shadows that enhance the figures' grim appearance.
244
+ The overall atmosphere is one of sorrow, mystery, and unease. The aesthetic
245
+ is gritty and realistic, yet with a surreal, almost dreamlike quality. The vibe
246
+ is dark, melancholic, and thought-provoking. The painting's texture is highly
247
+ visible, mimicking the rough texture of the wrappings and the canvas. There
248
+ is a signature in the bottom right corner, but the characters are illegible.
249
+ The image is a digital painting, not a photograph or collage, and contains no
250
+ synthetic elements beyond the digital creation process.
251
+ - A digital painting depicting a man sitting on a surfboard at the beach, looking
252
+ at his phone. The man wears a red shirt, green shorts, white headphones with
253
+ "AKG" written on them in a sans-serif font, and goggles. A woman is seen in
254
+ the background, partially submerged in the water. The ocean is a vibrant turquoise,
255
+ with white foamy waves. The sky is a clear, bright blue. The overall style is
256
+ reminiscent of a vintage surf poster, with a slightly distressed, textured effect
257
+ applied to the background, giving it a faded, retro look. The lighting is bright
258
+ and sunny, creating a warm, summery atmosphere. The color palette is predominantly
259
+ warm, with blues, greens, and reds dominating the scene. The aesthetic is a
260
+ blend of retro and contemporary, combining the classic imagery of surfing with
261
+ the modern element of technology. The vibe is relaxed yet stylish, capturing
262
+ a moment of leisure and connection. The image is a digital painting, not a photograph,
263
+ and there are no visible synthetic elements beyond the digital painting techniques
264
+ used to create the distressed texture and overall style.
265
+ - A photograph depicting the interior of a vintage bus at night. The image is
266
+ composed of a long shot, showcasing the entire bus's interior. The bus is adorned
267
+ with vibrant, multicolored advertisements and patterned upholstery. The lighting
268
+ is predominantly neon, creating a retro, cyberpunk aesthetic. The color palette
269
+ consists of deep purples, pinks, and blues, contrasted by the warm tones of
270
+ the seating and advertisements. The atmosphere is moody and atmospheric, with
271
+ a sense of quiet solitude. The style is reminiscent of 1980s synthwave or cyberpunk,
272
+ with a focus on vibrant colors and retro technology. The overall vibe is nostalgic
273
+ and futuristic. The advertisements feature various images and text, including
274
+ "CITY" in a bold, sans-serif font. The bus seats are upholstered in a rich,
275
+ tapestry-like fabric with intricate patterns. The screens display various advertisements
276
+ and images. The overall composition is symmetrical, with the seats and screens
277
+ mirroring each other. There are no apparent synthetic elements in the image.
278
+ The image is sharp and well-lit, with a focus on detail and texture.
279
+ - This is a digital painting or graphic, not a photograph. It depicts a whimsical,
280
+ fairytale-like street scene with a large, ornate wedding cake as the focal point.
281
+ The style is highly detailed and realistic, yet maintains a fantastical, dreamlike
282
+ quality. The color palette is warm and inviting, dominated by pastel shades
283
+ of pink, peach, and cream, contrasted with the deep browns and greens of the
284
+ architecture and foliage. The lighting is soft and diffused, creating a gentle,
285
+ romantic atmosphere. The scene is set in a cobblestone street lined with charming
286
+ shops and buildings, with flowers and greenery adorning the scene. The cake
287
+ is a two-tiered masterpiece, decorated with fresh berries and flowers, sitting
288
+ on an elegant cake stand. Surrounding the cake are various pastries and fruits
289
+ arranged on platters and bowls. The overall aesthetic is romantic, charming,
290
+ and slightly nostalgic, evoking a sense of warmth and celebration. The background
291
+ is slightly blurred, drawing attention to the cake and surrounding desserts
292
+ in the foreground. There is no text in the image. The image is composed using
293
+ digital painting techniques and likely incorporates synthetic elements to create
294
+ the fantastical setting and lighting effects. The vibe is cheerful, celebratory,
295
+ and romantic.
296
+ - This close-up photograph captures a meticulously plated dish of beef tenderloin,
297
+ presented on a sleek black plate. The tenderloin, sliced into bite-sized pieces,
298
+ is cooked to a rare to medium-rare perfection, showcasing a rich brown exterior
299
+ with a pinkish center. The beef is generously drizzled with a glossy, dark brown
300
+ sauce, possibly balsamic vinegar, which adds a sheen to the meat. Scattered
301
+ around the beef are small, vibrant cherry tomatoes, still attached to their
302
+ green stems, adding a pop of color and freshness to the dish. The plate is garnished
303
+ with a light sprinkling of white and pink salt, and a few green herbs, enhancing
304
+ both the visual appeal and flavor complexity. The overall presentation is elegant
305
+ and appetizing, with the dark hues of the beef and sauce contrasting beautifully
306
+ against the black plate.
307
+ - This is a digital painting or a heavily manipulated photograph, appearing as
308
+ a surreal portrait of a young woman. The composition is a close-up, focusing
309
+ on the face. The woman's face is partially obscured by fragmented, cracked,
310
+ light teal and off-white pieces resembling peeling paint or decaying skin. These
311
+ fragments are irregularly shaped and layered, creating a sense of depth and
312
+ texture. The woman's skin is subtly illuminated, with a warm, golden light highlighting
313
+ her features, particularly her lips and eyes. Her eyes are a striking light
314
+ blue, contrasting with the cool tones of the fragmented elements. The overall
315
+ color palette is muted, with teal, beige, and golden hues dominating. The atmosphere
316
+ is melancholic and mysterious, with a hint of ethereal beauty. The style is
317
+ surreal and painterly, blending realistic portraiture with abstract elements.
318
+ The vibe is introspective and unsettling, suggesting themes of vulnerability,
319
+ fragility, and hidden identity. The lighting is dramatic, with a chiaroscuro
320
+ effect emphasizing the texture and form of the fragmented elements. There is
321
+ no text in the image.
322
+ - In this vibrant outdoor photograph, a young couple, likely in their early 30s,
323
+ stands closely together, exuding happiness and warmth. The woman, positioned
324
+ on the left, has her arm affectionately draped around the man's neck. Both are
325
+ beaming with broad smiles, revealing their teeth. The man, with short brown
326
+ hair, is dressed in a black tank top, while the woman, with her brown hair pulled
327
+ back, sports small earrings. They both have tan skin, suggesting they have been
328
+ spending time outdoors. Behind them, a surfboard leans against a wall, hinting
329
+ at a beach setting. The background is slightly blurred, but one can make out
330
+ a building and a tree, adding to the relaxed, summery atmosphere. The couple's
331
+ joyful expressions and the casual beachside backdrop create a picturesque moment
332
+ of shared bliss.
333
+ - This is a digital painting, a graphic illustration, depicting a rusty, vintage
334
+ tram on a sandy beach. The composition is a medium shot, focusing on the tram
335
+ with the beach and a cityscape in the background. The style is reminiscent of
336
+ concept art or digital matte painting, with a painterly, slightly impressionistic
337
+ quality. The color palette is warm, with rusty reds and oranges on the tram
338
+ contrasting against the cool blues and greens of the ocean and sky. The lighting
339
+ is bright, suggesting a sunny day, with shadows cast by the tram and palm trees
340
+ on the sand. The atmosphere is serene yet slightly melancholic, evoking a sense
341
+ of nostalgia and abandonment. The overall aesthetic is whimsical and slightly
342
+ surreal, with a touch of magical realism. The vibe is peaceful and contemplative.
343
+ The sky is a vibrant blue with fluffy white clouds. The ocean is a turquoise
344
+ color with gentle waves. The city in the distance is a hazy silhouette. The
345
+ palm trees are lush and green. The tram is heavily weathered, with peeling paint
346
+ and graffiti. The tracks are rusty and worn. The sand is light beige, with shadows
347
+ from the tram and vegetation. There is no text in the image.
348
+ - A photograph depicts two Asian senior adults, a man and a woman, standing side-by-side,
349
+ reviewing paperwork and using a handheld device in a brightly lit, modern cafe
350
+ setting. The man, with short gray hair, wears a white long-sleeved shirt and
351
+ a denim apron. The woman, with short dark hair, wears a white long-sleeved shirt
352
+ and a denim apron. They are both smiling and appear to be collaborating. The
353
+ background features a light-colored wall, wooden shelves with various items,
354
+ and a partially visible laptop. The overall atmosphere is warm, friendly, and
355
+ professional. The lighting is soft and natural, enhancing the image's bright
356
+ and airy feel. The color palette is muted, with soft whites, grays, and blues
357
+ dominating. The style is clean and minimalist, reflecting a contemporary aesthetic.
358
+ The vibe is calm, collaborative, and business-oriented.
359
+ - A photograph depicts a rustic Christmas scene. A blurred golden reindeer stands
360
+ in the background, out of focus. In the foreground, a wooden star-shaped ornament
361
+ rests on a weathered wooden surface. The star is light beige, with the word
362
+ "xmas" carved into its center in a simple, sans-serif font. A red and white
363
+ gingham ribbon tied in a bow adorns the star, accented by a small wooden button.
364
+ The overall lighting is soft and diffused, creating a warm, nostalgic atmosphere.
365
+ The color palette is muted, with earthy tones and soft reds. The style is vintage
366
+ and charming, evoking a sense of cozy holiday tradition. The image's aesthetic
367
+ is minimalist and rustic, with a focus on texture and detail. The vibe is calm,
368
+ peaceful, and heartwarming.
369
+ - A photograph depicts a fluffy lop-eared rabbit sitting on a weathered wooden
370
+ surface outdoors. The rabbit is predominantly white with patches of light brown
371
+ and tan fur, particularly on its head and ears. Its ears droop noticeably, and
372
+ its fur appears soft and thick. The rabbit's eyes are dark and expressive. It
373
+ is positioned slightly off-center, facing towards the left of the frame. Behind
374
+ the rabbit, slightly out of focus, is a miniature dark red metal wheelbarrow.
375
+ A partially visible orange apple sits to the left of the rabbit. Fallen autumn
376
+ leaves, predominantly reddish-brown, are scattered around the rabbit and apple
377
+ on the wooden surface. The background is a blurred but visible expanse of green
378
+ grass, suggesting an outdoor setting. The lighting is soft and natural, likely
379
+ diffused daylight, casting no harsh shadows. The overall atmosphere is calm,
380
+ peaceful, and autumnal. The aesthetic is rustic and charming, with a focus on
381
+ the rabbit as the main subject. The color palette is muted and natural, consisting
382
+ mainly of whites, browns, oranges, and greens. The style is naturalistic and
383
+ straightforward, without any overt artistic manipulation. The vibe is gentle
384
+ and heartwarming.
385
+ - The image showcases a white and brown rabbit with droopy ears, sitting on a
386
+ wooden surface. Behind the rabbit, there's a miniature cart with a wheel. Adjacent
387
+ to the cart, there's an orange apple and some dried autumn leaves scattered
388
+ around. The backdrop consists of a blurred green field, suggesting an outdoor
389
+ setting during the fall season.
390
+ - A photograph depicts a young woman with dark brown hair styled in a loose braid,
391
+ wearing a floral headband and a flowing, pale pink and purple floral dress.
392
+ She sits on a plush, dark reddish-brown velvet couch draped with purple velvet
393
+ fabric. The background is a vibrant, retro-style wallpaper with large orange
394
+ and pink floral patterns on a dark brown base. The woman's hands rest gently
395
+ on a dark-colored pillow with a large floral print featuring pink and white
396
+ roses. The lighting is soft and diffused, creating a warm and intimate atmosphere.
397
+ The overall aesthetic is bohemian and romantic, with a vintage 70s vibe. The
398
+ colours are rich and saturated, with a focus on warm tones. The composition
399
+ is a close-up shot, focusing on the woman and her surroundings. The image has
400
+ a dreamy, slightly melancholic mood.
401
+ - A young woman with a braided hairstyle and a golden headband is seated against
402
+ a vibrant orange-red wallpaper with floral patterns. She wears a sleeveless
403
+ dress adorned with floral prints and is draped in a deep purple fabric. She
404
+ holds a floral-patterned pillow close to her and appears to be in a contemplative
405
+ mood.
406
+ - A photograph depicts a young woman with long brown hair wearing a floral dress
407
+ and beaded jewelry, standing in front of a vibrant red autumnal backdrop. The
408
+ woman is gently holding and examining dark berries from a vine. The dress is
409
+ black with red floral patterns, adorned with red and black beaded embellishments
410
+ on the sleeves and neckline. Her hair is styled with a red floral crown. The
411
+ background is a wall of red leaves, creating a striking contrast with the woman's
412
+ dark dress. The lighting is natural, with sunlight illuminating the scene, casting
413
+ a warm glow on the woman and the leaves. The overall aesthetic is romantic,
414
+ autumnal, and slightly mystical. The atmosphere is serene and peaceful. The
415
+ style is reminiscent of folk art or fairytale imagery. The vibe is dreamy and
416
+ evocative of autumnal beauty.
417
+ - A woman with dark hair and a floral headpiece stands amidst a backdrop of vibrant
418
+ red leaves. She wears a dress adorned with red and black patterns, and her fingers
419
+ delicately hold a cluster of red berries. The sunlight filters through, casting
420
+ a warm glow on her face and the surrounding foliage.
421
+ - 'A photograph depicts a mason jar filled with vibrant red tomato juice, garnished
422
+ with a sprig of fresh celery, sitting on a rustic wooden cutting board. The
423
+ background features blurred but visible ingredients: ripe red tomatoes, a red
424
+ bell pepper, yellow bell peppers, and fresh basil leaves, all arranged on a
425
+ wooden surface. The lighting is soft and natural, creating a warm and inviting
426
+ atmosphere. The overall aesthetic is rustic, wholesome, and healthy, with a
427
+ focus on natural food photography. The colours are rich and saturated, with
428
+ the red of the tomatoes and juice being the dominant hue, complemented by the
429
+ greens of the herbs and the yellows of the peppers. The style is simple and
430
+ straightforward, emphasizing the natural beauty of the ingredients. The vibe
431
+ is relaxed, comforting, and appealing to those interested in healthy eating
432
+ and fresh produce. There is no text in the image.'
433
+ - The image showcases a rustic wooden table setting with a glass jar filled with
434
+ a vibrant red juice or smoothie. The jar is adorned with fresh green parsley
435
+ leaves. Surrounding the jar are various fresh ingredients, including tomatoes,
436
+ bell peppers, and basil leaves. The backdrop is a wooden wall, adding to the
437
+ rustic ambiance.
438
+ _target_: prx.callbacks.LogDiffusionImages
439
+ size: 512
440
+ guidance_scale: 3.5
441
+ seed: 42
442
+ speed_monitor:
443
+ _target_: composer.callbacks.speed_monitor.SpeedMonitor
444
+ window_size: 10
445
+ lr_monitor:
446
+ _target_: composer.callbacks.lr_monitor.LRMonitor
447
+ memory_monitor:
448
+ _target_: composer.callbacks.memory_monitor.MemoryMonitor
449
+ runtime_estimator:
450
+ _target_: composer.callbacks.runtime_estimator.RuntimeEstimator
451
+ optimizer_monitor:
452
+ _target_: composer.callbacks.OptimizerMonitor
453
+ nan_monitor:
454
+ _target_: composer.callbacks.NaNMonitor
455
+ generation_metrics:
456
+ _target_: prx.callbacks.LogQualityMetrics
457
+ frequency: 10_000ba
458
+ guidance_scales:
459
+ - 3.5
460
+ seed: 42
461
+ num_inference_steps: 50
462
+ compute_fid: true
463
+ compute_cmmd: true
464
+ compute_dino_mmd: true
465
+ max_samples: 10000
466
+ project: PRX
467
+ group: dose-response-full
468
+ name: C4
469
+ nccl_sleep: 1
470
+ activation_memory_budget: 1
471
+ image_size: 512
472
+ patch_size_pixels: 32
473
+ global_batch_size: 256
474
+ device_train_microbatch_size: 32
475
+ device_eval_microbatch_size: 16
476
+ seed: 42
477
+ eval_first: false
478
+ compile_denoiser: true
479
+ compile_vae: true
480
+ algorithms:
481
+ gradient_clipping:
482
+ _target_: composer.algorithms.GradientClipping
483
+ clipping_type: norm
484
+ clipping_threshold: 0.2
485
+ tread:
486
+ _target_: prx.algorithm.tread.Tread
487
+ route_start: 2
488
+ route_end: 12
489
+ routing_probability: 0.5
490
+ detach: false
491
+ seed: 42
492
+ train_only: true
493
+ self_guidance: true
494
+ repa:
495
+ _target_: prx.algorithm.repa.REPA
496
+ lambda_weight: 0.5
497
+ layer_index: 7
498
+ encoder: dinov3_vitl16
499
+ compile_encoder: true
500
+ lpips:
501
+ _target_: prx.algorithm.lpips.LPIPS
502
+ lpips_weight: 0.1
503
+ lpips_net: vgg
504
+ t_threshold: 1
505
+ resize_factor: 0.5
506
+ pdino:
507
+ _target_: prx.algorithm.perceptual_dino.PerceptualDINO
508
+ pdino_weight: 0.01
509
+ encoder: dinov2_vitb14_reg
510
+ t_threshold: 1
511
+ resize_resolution: 224
512
+ ema:
513
+ _target_: prx.algorithm.ema.EMA
514
+ smoothing: 0.999
515
+ update_interval: 10ba
516
+ ema_start: 0ba
517
+ model:
518
+ _target_: prx.pipeline.models_factory.build_pipeline
519
+ denoiser_config:
520
+ _model_class: PRX
521
+ in_channels: 3
522
+ patch_size: 32
523
+ context_in_dim: 2304
524
+ hidden_size: 1792
525
+ mlp_ratio: 3.5
526
+ num_heads: 28
527
+ depth: 16
528
+ axes_dim:
529
+ - 32
530
+ - 32
531
+ theta: 10000
532
+ time_factor: 1000.0
533
+ time_max_period: 10000
534
+ conditioning_block_ids: null
535
+ bottleneck_size: 256
536
+ text_tower_config:
537
+ preset_name: t5gemma2b-256-bf16
538
+ model_name: google/t5gemma-2b-2b-ul2
539
+ prompt_max_tokens: 256
540
+ use_attn_mask: true
541
+ use_last_hidden_state: true
542
+ only_tokenizer: false
543
+ torch_dtype: torch.bfloat16
544
+ unpadded: false
545
+ vae_config:
546
+ model_name: identity
547
+ model_class: IdentityVAE
548
+ default_channels: 3
549
+ torch_dtype: torch.bfloat16
550
+ scheduler_config:
551
+ prediction_type: x_prediction_flow_matching
552
+ num_train_timesteps: 1000
553
+ timestep_shift: 3.0
554
+ input_size: 512
555
+ p_drop_caption: 0.1
556
+ val_metrics:
557
+ - _target_: torchmetrics.MeanSquaredError
558
+ val_guidance_scales: []
559
+ loss_bins:
560
+ - - 0.0
561
+ - 0.3
562
+ - - 0.3
563
+ - 0.6
564
+ - - 0.6
565
+ - 1.0
566
+ scheduler:
567
+ _target_: composer.optim.MultiStepWithWarmupScheduler
568
+ t_warmup: 1000ba
569
+ milestones:
570
+ - 1e9ep
571
+ logger:
572
+ wandb:
573
+ _target_: composer.loggers.WandBLogger
574
+ project: PRX
575
+ group: dose-response-full
576
+ name: C4
577
+ trainer:
578
+ _target_: composer.Trainer
579
+ device: gpu
580
+ max_duration: 100_000ba
581
+ eval_interval: 0
582
+ eval_subset_num_batches: 64
583
+ device_train_microbatch_size: 32
584
+ run_name: dose-response-C4-full-phase1
585
+ seed: 42
586
+ scale_schedule_ratio: 1.0
587
+ save_folder: /checkpoint/dream/felixfriedrich/diffusion_safety/dose_response/checkpoints_full/C4/phase1
588
+ save_interval: 10_000ba
589
+ save_num_checkpoints_to_keep: 1
590
+ save_overwrite: true
591
+ save_weights_only: true
592
+ save_ignore_keys:
593
+ - state/model/vae*
594
+ - state/model/text_tower*
595
+ autoresume: false
596
+ precision: amp_bf16
597
+ dist_timeout: 7200.0
598
+ parallelism_config:
599
+ fsdp:
600
+ reshard_after_forward: false
601
+ device_mesh: mesh_2d
602
+ use_orig_params: true
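
Several values in this config constrain one another (image size and patch size, hidden size and head count, global batch and microbatch). The sketch below is a sanity check that recomputes the derived quantities from numbers copied out of the diff. It is not part of the PRX codebase: the function name `derived_quantities` and the `num_devices` argument are hypothetical (the config does not state a device count), and the check that `axes_dim` sums to the per-head dimension assumes a Flux-style rotary embedding layout, which the config keys suggest but do not confirm.

```python
import math

# Values copied from the config.yaml diff above.
CFG = {
    "image_size": 512,
    "patch_size": 32,
    "in_channels": 3,          # IdentityVAE: the denoiser sees raw RGB pixels
    "hidden_size": 1792,
    "num_heads": 28,
    "axes_dim": [32, 32],
    "global_batch_size": 256,
    "device_train_microbatch_size": 32,
    "max_duration_batches": 100_000,  # trainer.max_duration: 100_000ba
}


def derived_quantities(cfg, num_devices):
    """Recompute quantities implied by the config.

    num_devices is a hypothetical input; it is not given in the config.
    """
    patches_per_side = cfg["image_size"] // cfg["patch_size"]
    head_dim = cfg["hidden_size"] // cfg["num_heads"]
    per_device_batch = cfg["global_batch_size"] // num_devices
    return {
        # Number of image tokens the transformer processes per sample.
        "tokens_per_image": patches_per_side ** 2,
        # Flattened size of one patch before it is projected to hidden_size.
        "patch_dim": cfg["in_channels"] * cfg["patch_size"] ** 2,
        "head_dim": head_dim,
        # Assumption: rotary axes dimensions partition the per-head dimension.
        "rope_dims_match_head": sum(cfg["axes_dim"]) == head_dim,
        "per_device_batch": per_device_batch,
        # Microbatches accumulated per optimizer step on each device.
        "microbatches_per_step": math.ceil(
            per_device_batch / cfg["device_train_microbatch_size"]
        ),
        # Total training samples if the run reaches max_duration.
        "samples_seen": cfg["global_batch_size"] * cfg["max_duration_batches"],
    }


if __name__ == "__main__":
    for key, value in derived_quantities(CFG, num_devices=8).items():
        print(f"{key}: {value}")
```

With the assumed 8 devices, each device receives 32 samples per step, which equals `device_train_microbatch_size`, so no gradient accumulation would be needed. A smaller device count would raise `microbatches_per_step` accordingly.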