Update README.md
README.md
CHANGED
|
@@ -4,10 +4,6 @@ tags:
|
|
| 4 |
- lora
|
| 5 |
- diffusers
|
| 6 |
- template:diffusion-lora
|
| 7 |
-
widget:
|
| 8 |
-
- output:
|
| 9 |
-
url: images/vvvvvvvvvvvvv.png
|
| 10 |
-
text: '-'
|
| 11 |
base_model: black-forest-labs/FLUX.1-Kontext-dev
|
| 12 |
instance_prompt: >-
|
| 13 |
[photo content], recreate the scene from a top-down perspective. Maintain all
|
|
@@ -20,13 +16,67 @@ language:
|
|
| 20 |
pipeline_tag: image-to-image
|
| 21 |
library_name: diffusers
|
| 22 |
---
|
| 23 |
-
# Kontext-Top-Down-View
|
| 24 |
|
| 25 |
-
|
| 26 |
|
| 27 |
> [!note]
|
| 28 |
[photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.
|
| 29 |
|
| 30 |
## Trigger words
|
| 31 |
|
| 32 |
You should use `[photo content]` to trigger the image generation.
|
|
@@ -44,5 +94,4 @@ You should use `and environmental shadows remain naturally aligned from this ele
|
|
| 44 |
|
| 45 |
## Download model
|
| 46 |
|
| 47 |
-
|
| 48 |
[Download](/prithivMLmods/Kontext-Top-Down-View/tree/main) them in the Files & versions tab.
|
|
|
|
| 4 |
- lora
|
| 5 |
- diffusers
|
| 6 |
- template:diffusion-lora
|
| 7 |
base_model: black-forest-labs/FLUX.1-Kontext-dev
|
| 8 |
instance_prompt: >-
|
| 9 |
[photo content], recreate the scene from a top-down perspective. Maintain all
|
|
|
|
| 16 |
pipeline_tag: image-to-image
|
| 17 |
library_name: diffusers
|
| 18 |
---
|
| 19 |
+
# **Kontext-Top-Down-View**
|
| 20 |
|
| 21 |
+
Kontext-Top-Down-View is an adapter for black-forest-labs' FLUX.1-Kontext-dev that re-renders a scene from a top-down perspective while maintaining visual proportions, consistent lighting, and realistic spatial relationships. Backgrounds, textures, and environmental details are kept natural and coherent with the new viewpoint. The adapter was trained on 800 images (400 start images paired with 400 end images) to produce geometry-consistent top-down views.
|
| 22 |
|
| 23 |
> [!note]
|
| 24 |
[photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.
|
| 25 |
|
| 26 |
+
|
| 27 |
+
---
|
| 28 |
+
|
| 29 |
+
## Parameter Settings
|
| 30 |
+
|
| 31 |
+
| Setting | Value |
|
| 32 |
+
| ------------------------ | ------------------------ |
|
| 33 |
+
| Module Type | Adapter |
|
| 34 |
+
| Base Model | FLUX.1 Kontext Dev - fp8 |
|
| 35 |
+
| Trigger Words | [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle. |
|
| 36 |
+
| Image Processing Repeats | 50 |
|
| 37 |
+
| Epochs | 25 |
|
| 38 |
+
| Save Every N Epochs | 1 |
|
| 39 |
+
|
| 40 |
+
Labeling: DeepCaption-VLA-7B (natural language, English)
|
| 41 |
+
|
| 42 |
+
Total images used for training: 800 (400 start images, 400 end images)
|
| 43 |
+
|
| 44 |
+
## Training Parameters
|
| 45 |
+
|
| 46 |
+
| Setting | Value |
|
| 47 |
+
| --------------------------- | --------- |
|
| 48 |
+
| Seed | - |
|
| 49 |
+
| Clip Skip | - |
|
| 50 |
+
| Text Encoder LR | 0.00001 |
|
| 51 |
+
| UNet LR | 0.00005 |
|
| 52 |
+
| LR Scheduler | constant |
|
| 53 |
+
| Optimizer | AdamW8bit |
|
| 54 |
+
| Network Dimension | 64 |
|
| 55 |
+
| Network Alpha | 32 |
|
| 56 |
+
| Gradient Accumulation Steps | - |
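
The Network Dimension and Network Alpha values above can be read together. A minimal sketch, assuming the trainer follows the common LoRA convention of scaling the low-rank update by `alpha / dim` (the README does not state which convention was used, and `lora_scale` is a hypothetical helper name):

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Multiplier on the low-rank update under the assumed alpha / dim convention."""
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


# Values taken from the Training Parameters table.
scale = lora_scale(network_alpha=32, network_dim=64)
print(scale)  # 0.5
```

Under that assumption, the adapter's learned update is applied at half strength relative to a configuration where alpha equals the dimension.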
|
| 57 |
+
|
| 58 |
+
## Label Parameters
|
| 59 |
+
|
| 60 |
+
| Setting | Value |
|
| 61 |
+
| --------------- | ----- |
|
| 62 |
+
| Shuffle Caption | - |
|
| 63 |
+
| Keep N Tokens | - |
|
| 64 |
+
|
| 65 |
+
## Advanced Parameters
|
| 66 |
+
|
| 67 |
+
| Setting | Value |
|
| 68 |
+
| ------------------------- | ----- |
|
| 69 |
+
| Noise Offset | 0.03 |
|
| 70 |
+
| Multires Noise Discount | 0.1 |
|
| 71 |
+
| Multires Noise Iterations | 10 |
|
| 72 |
+
| Conv Dimension | - |
|
| 73 |
+
| Conv Alpha | - |
|
| 74 |
+
| Batch Size | - |
|
| 75 |
+
| Steps | 3800 & 400 (warm-up) |
|
| 76 |
+
| Sampler | euler |
|
| 77 |
+
|
| 78 |
+
---
|
| 79 |
+
|
| 80 |
## Trigger words
|
| 81 |
|
| 82 |
You should use `[photo content]` to trigger the image generation.
|
|
|
|
| 94 |
|
| 95 |
## Download model
|
| 96 |
|
|
|
|
| 97 |
[Download](/prithivMLmods/Kontext-Top-Down-View/tree/main) them in the Files & versions tab.
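
Once the weights are available, loading the adapter with 🧨 diffusers might look like the sketch below. It is not taken from the model card: it assumes a diffusers release that ships `FluxKontextPipeline`, a CUDA GPU, and a repository with a single LoRA weights file (otherwise pass `weight_name=` to `load_lora_weights`). The helper name `top_down_view`, the output path, and `guidance_scale=2.5` are illustrative choices.

```python
REPO_ID = "prithivMLmods/Kontext-Top-Down-View"
BASE_MODEL = "black-forest-labs/FLUX.1-Kontext-dev"

# Prompt from the model card, used verbatim including the trigger words.
PROMPT = (
    "[photo content], recreate the scene from a top-down perspective. "
    "Maintain all visual proportions, lighting consistency, and realistic "
    "spatial relationships. Ensure the background, textures, and environmental "
    "shadows remain naturally aligned from this elevated angle."
)


def top_down_view(image_path: str, output_path: str = "top_down.png") -> str:
    """Re-render the image at image_path from above and save the result."""
    # Heavy imports stay inside the function so the module loads without a GPU.
    import torch
    from diffusers import FluxKontextPipeline
    from diffusers.utils import load_image

    pipe = FluxKontextPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
    pipe.load_lora_weights(REPO_ID)  # assumes one weights file in the repo
    pipe.to("cuda")

    source = load_image(image_path)
    # guidance_scale is an assumed starting point, not a value from the card.
    result = pipe(image=source, prompt=PROMPT, guidance_scale=2.5).images[0]
    result.save(output_path)
    return output_path
```

Calling `top_down_view("room.jpg")` with a local photo would write the generated image to `top_down.png`; the input filename here is a placeholder.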
|