---
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora
base_model: black-forest-labs/FLUX.1-Kontext-dev
instance_prompt: >-
  [photo content], recreate the scene from a top-down perspective. Maintain all
  visual proportions, lighting consistency, and realistic spatial relationships.
  Ensure the background, textures, and environmental shadows remain naturally
  aligned from this elevated angle.
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
language:
- en
pipeline_tag: image-to-image
library_name: diffusers
---

![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/gkn6DvNaQn14GgbhHgq5v.png)

# **Kontext-Top-Down-View**

Kontext-Top-Down-View is an experimental adapter for Black Forest Labs' FLUX.1-Kontext-dev that re-renders a scene from a top-down perspective while aiming to preserve visual proportions, consistent lighting, and realistic spatial relationships. Backgrounds, textures, and environmental details are intended to stay natural and contextually coherent from the new camera angle. It was trained on 800 image pairs (400 start images and 400 end images) for geometry-consistent top-down scene generation.

> [!note]
> [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.

> You can modify the prompt to alter its properties and subjective elements. Note that this is an experimental adapter and its outputs may contain artifacts.
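
The prompt above is a template: `[photo content]` stands for a short description of the input image. A minimal sketch of filling it in is shown below. The helper name `build_prompt` and the example description are hypothetical and not part of this model card.

```python
# Prompt template from this model card; {photo_content} replaces "[photo content]".
PROMPT_TEMPLATE = (
    "{photo_content}, recreate the scene from a top-down perspective. "
    "Maintain all visual proportions, lighting consistency, and realistic "
    "spatial relationships. Ensure the background, textures, and "
    "environmental shadows remain naturally aligned from this elevated angle."
)


def build_prompt(photo_content: str) -> str:
    """Insert a scene description into the top-down prompt template.

    Hypothetical helper: trims whitespace and trailing punctuation so the
    description joins cleanly with the comma that follows it.
    """
    description = photo_content.strip().rstrip(".,")
    if not description:
        raise ValueError("photo_content must not be empty")
    return PROMPT_TEMPLATE.format(photo_content=description)


# Example description (an assumption, chosen only for illustration).
print(build_prompt("A cluttered wooden desk with a laptop"))
```

Keeping the rest of the template unchanged is the safest starting point, since the adapter was trained with this wording. Edits to the later sentences are possible but may change the result.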

---

## **Sample Inferences: Demo**

<table style="width:100%; border-collapse:collapse;">
  <tr>
    <td style="width:50%; text-align:center;">
      <img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/O9hti3lQODGSiZLGPm811.jpeg" 
           alt="Input image" style="width:100%; height:auto;"/>
    </td>
    <td style="width:50%; text-align:center;">
      <img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/iH52aQZ7BA6Gdnmj2rkgX.webp" 
           alt="Kontext-Top-Down-View" style="width:100%; height:auto;"/>
    </td>
  </tr>
</table>

<table style="width:100%; border-collapse:collapse;">
  <tr>
    <td style="width:50%; text-align:center;">
      <img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/N_nMU9x0hnb4HAdchJtQC.jpeg" 
           alt="Input image" style="width:100%; height:auto;"/>
    </td>
    <td style="width:50%; text-align:center;">
      <img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/r_hw2cwckPCfapUZyHe9c.webp" 
           alt="Kontext-Top-Down-View" style="width:100%; height:auto;"/>
    </td>
  </tr>
</table>

---


## Parameter Settings

| Setting                  | Value                    |
| ------------------------ | ------------------------ |
| Module Type              | Adapter                  |
| Base Model               | FLUX.1 Kontext Dev - fp8 |
| Trigger Words            | [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle. |
| Image Processing Repeats | 50                       |
| Epochs                   | 25                       |
| Save Every N Epochs      | 1                        |

    Labeling: DeepCaption-VLA-7B (natural language, English)
    
    Total Images Used for Training: 800 Image Pairs (400 Start, 400 End)

## Training Parameters

| Setting                     | Value     |
| --------------------------- | --------- |
| Seed                        | -         |
| Clip Skip                   | -         |
| Text Encoder LR             | 0.00001   |
| UNet LR                     | 0.00005   |
| LR Scheduler                | constant  |
| Optimizer                   | AdamW8bit |
| Network Dimension           | 64        |
| Network Alpha               | 32        |
| Gradient Accumulation Steps | -         |
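
Many LoRA trainers scale the learned low-rank update by `alpha / dimension`. The card does not state which trainer or convention was used, so the sketch below assumes that common convention. The function name `lora_scale` is hypothetical.

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Return the multiplier applied to the low-rank update (alpha / dim).

    Assumes the common LoRA convention; this card does not confirm
    that the adapter was trained with it.
    """
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


# Values from the table above: Network Alpha 32, Network Dimension 64.
print(lora_scale(32, 64))  # 0.5
```

Under that assumption, the adapter's update is applied at half strength relative to a setup where alpha equals the dimension.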

## Label Parameters

| Setting         | Value |
| --------------- | ----- |
| Shuffle Caption | -     |
| Keep N Tokens   | -     |

## Advanced Parameters

| Setting                   | Value |
| ------------------------- | ----- |
| Noise Offset              | 0.03  |
| Multires Noise Discount   | 0.1   |
| Multires Noise Iterations | 10    |
| Conv Dimension            | -     |
| Conv Alpha                | -     |
| Batch Size                | -     |
| Steps                     | 3800 & 400 (warm-up) |
| Sampler                   | euler |

---

## Trigger words

You should use the following prompt to trigger the image generation, replacing `[photo content]` with a description of the input image:

`[photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.`


## Download model

[Download](/prithivMLmods/Kontext-Top-Down-View/tree/main) the model weights in the Files & versions tab.
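
## Use it with the diffusers library

The following is a minimal sketch of applying the adapter with `diffusers`, not an official recipe from this card. It assumes a recent `diffusers` release that ships `FluxKontextPipeline`, a CUDA GPU with enough memory for the base model, and that the weight file in the repository can be auto-detected (otherwise pass `weight_name=` with the file name shown in the Files & versions tab). The function name, adapter name, `guidance_scale`, `num_inference_steps`, and output path are illustrative assumptions. You may also need to accept the base model's license on the Hub before downloading it.

```python
BASE_MODEL = "black-forest-labs/FLUX.1-Kontext-dev"
ADAPTER_REPO = "prithivMLmods/Kontext-Top-Down-View"

PROMPT_SUFFIX = (
    "recreate the scene from a top-down perspective. Maintain all visual "
    "proportions, lighting consistency, and realistic spatial relationships. "
    "Ensure the background, textures, and environmental shadows remain "
    "naturally aligned from this elevated angle."
)


def make_prompt(photo_content: str) -> str:
    """Prefix the trigger prompt with a description of the input image."""
    return f"{photo_content.strip()}, {PROMPT_SUFFIX}"


def generate_top_down(image_path_or_url: str, photo_content: str,
                      output_path: str = "top_down.png") -> str:
    """Run one image-to-image pass and save the result (hypothetical helper)."""
    # Heavy imports live here so this module can be imported without a GPU stack.
    import torch
    from diffusers import FluxKontextPipeline
    from diffusers.utils import load_image

    pipe = FluxKontextPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
    # adapter_name is an arbitrary label chosen for this sketch.
    pipe.load_lora_weights(ADAPTER_REPO, adapter_name="top_down")
    pipe.to("cuda")

    image = load_image(image_path_or_url)
    result = pipe(
        image=image,
        prompt=make_prompt(photo_content),
        guidance_scale=2.5,       # assumed value, not taken from this card
        num_inference_steps=28,   # assumed value, not taken from this card
    ).images[0]
    result.save(output_path)
    return output_path
```

A call such as `generate_top_down("input.jpg", "a living room with a sofa and coffee table")` would then write the top-down rendering to `top_down.png`; `input.jpg` is a placeholder for your own image.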