	---
	tags:
	- text-to-image
	- lora
	- diffusers
	- template:diffusion-lora
	base_model: black-forest-labs/FLUX.1-Kontext-dev
	instance_prompt: >-
	  [photo content], recreate the scene from a top-down perspective. Maintain all
	  visual proportions, lighting consistency, and realistic spatial relationships.
	  Ensure the background, textures, and environmental shadows remain naturally
	  aligned from this elevated angle.
	license: other
	license_name: flux-1-dev-non-commercial-license
	license_link: LICENSE.md
	language:
	- en
	pipeline_tag: image-to-image
	library_name: diffusers
	---

	![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/gkn6DvNaQn14GgbhHgq5v.png)

	# Kontext-Top-Down-View

	Kontext-Top-Down-View is an experimental adapter for black-forest-labs' FLUX.1-Kontext-dev that re-renders a scene from a top-down perspective while preserving visual proportions, lighting consistency, and realistic spatial relationships. It aims to keep backgrounds, textures, and environmental details natural and contextually coherent from the new viewpoint. The adapter was trained on 800 image pairs (400 start images and 400 end images) for geometry-consistent top-down scene generation.

	> [!note]
	> [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.
	>
	> You can modify the prompt to alter its properties and subjective elements. Note: this is an experimental adapter and may produce artifacts.

	---

	## Sample Inferences: Demo

	<table style="width:100%; border-collapse:collapse;">
	<tr>
	<td style="width:50%; text-align:center;">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/O9hti3lQODGSiZLGPm811.jpeg"
	alt="Kontext-Top-Down-View" style="width:100%; height:auto;"/>
	</td>
	<td style="width:50%; text-align:center;">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/iH52aQZ7BA6Gdnmj2rkgX.webp"
	alt="Kontext-Top-Down-View" style="width:100%; height:auto;"/>
	</td>
	</tr>
	</table>

	<table style="width:100%; border-collapse:collapse;">
	<tr>
	<td style="width:50%; text-align:center;">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/N_nMU9x0hnb4HAdchJtQC.jpeg"
	alt="Kontext-Top-Down-View" style="width:100%; height:auto;"/>
	</td>
	<td style="width:50%; text-align:center;">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/r_hw2cwckPCfapUZyHe9c.webp"
	alt="Kontext-Top-Down-View" style="width:100%; height:auto;"/>
	</td>
	</tr>
	</table>

	---


	## Parameter Settings

	| Setting | Value |
	| ------------------------ | ------------------------ |
	| Module Type | Adapter |
	| Base Model | FLUX.1 Kontext Dev - fp8 |
	| Trigger Words | [photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle. |
	| Image Processing Repeats | 50 |
	| Epochs | 25 |
	| Save Every N Epochs | 1 |

	Labeling: DeepCaption-VLA-7B (natural language, English)

	Total Images Used for Training: 800 Image Pairs (400 Start, 400 End)

	## Training Parameters

	| Setting | Value |
	| --------------------------- | --------- |
	| Seed | - |
	| Clip Skip | - |
	| Text Encoder LR | 0.00001 |
	| UNet LR | 0.00005 |
	| LR Scheduler | constant |
	| Optimizer | AdamW8bit |
	| Network Dimension | 64 |
	| Network Alpha | 32 |
	| Gradient Accumulation Steps | - |
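
The Network Dimension and Network Alpha values above can be read together. A minimal sketch, assuming the trainer follows the common LoRA convention of scaling the low-rank update by `alpha / dim` (the card does not state which trainer was used, and `lora_scale` is a hypothetical helper written for illustration):

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Return the alpha / rank factor applied to the low-rank update.

    Assumes the common LoRA convention (scale = alpha / dim); the trainer
    behind this card may apply a different rule.
    """
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


# Values taken from the Training Parameters table.
scale = lora_scale(network_alpha=32, network_dim=64)
print(scale)  # 0.5
```

Under that assumption, the adapter's update is applied at half strength relative to an alpha equal to the dimension.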

	## Label Parameters

	| Setting | Value |
	| --------------- | ----- |
	| Shuffle Caption | - |
	| Keep N Tokens | - |

	## Advanced Parameters

	| Setting | Value |
	| ------------------------- | ----- |
	| Noise Offset | 0.03 |
	| Multires Noise Discount | 0.1 |
	| Multires Noise Iterations | 10 |
	| Conv Dimension | - |
	| Conv Alpha | - |
	| Batch Size | - |
	| Steps | 3800 & 400 (warm-up) |
	| Sampler | euler |

	---

	## Trigger words

	You should use the full prompt below to trigger the image generation:

	`[photo content], recreate the scene from a top-down perspective. Maintain all visual proportions, lighting consistency, and realistic spatial relationships. Ensure the background, textures, and environmental shadows remain naturally aligned from this elevated angle.`


	## Download model

	[Download](/prithivMLmods/Kontext-Top-Down-View/tree/main) the weights from the Files & versions tab.
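
One way to apply the adapter with the trigger prompt is through `diffusers`. The following is a minimal sketch under stated assumptions, not a tested recipe from the author: it assumes a `diffusers` release that ships `FluxKontextPipeline`, a CUDA GPU, and a repository with a single LoRA weight file (otherwise pass `weight_name=` to `load_lora_weights`). The helper names, the `guidance_scale` value, the sample description, and the file paths are illustrative choices.

```python
# Trigger prompt from this card, with the [photo content] slot as a placeholder.
TRIGGER_TEMPLATE = (
    "{photo_content}, recreate the scene from a top-down perspective. "
    "Maintain all visual proportions, lighting consistency, and realistic "
    "spatial relationships. Ensure the background, textures, and "
    "environmental shadows remain naturally aligned from this elevated angle."
)

BASE_MODEL = "black-forest-labs/FLUX.1-Kontext-dev"
ADAPTER_REPO = "prithivMLmods/Kontext-Top-Down-View"


def build_prompt(photo_content: str) -> str:
    """Fill the [photo content] slot with a short description of the input."""
    return TRIGGER_TEMPLATE.format(photo_content=photo_content.strip())


def run_top_down(image_path: str, photo_content: str,
                 out_path: str = "top_down.png") -> None:
    """Hypothetical helper: load base model + adapter and save one result."""
    # Heavy imports are deferred so build_prompt works without a GPU stack.
    import torch
    from diffusers import FluxKontextPipeline
    from diffusers.utils import load_image

    pipe = FluxKontextPipeline.from_pretrained(
        BASE_MODEL, torch_dtype=torch.bfloat16
    )
    # Assumption: the repo holds one LoRA file, so no weight_name is given.
    pipe.load_lora_weights(ADAPTER_REPO)
    pipe.to("cuda")

    result = pipe(
        image=load_image(image_path),
        prompt=build_prompt(photo_content),
        guidance_scale=2.5,  # illustrative value, not taken from this card
    ).images[0]
    result.save(out_path)


if __name__ == "__main__":
    # Only prints the prompt; call run_top_down(...) on a machine with a GPU.
    print(build_prompt("A kitchen with a wooden table"))
```

Calling `run_top_down("input.jpg", "A kitchen with a wooden table")` would then write the top-down rendering to `top_down.png`, assuming access to the gated base model has been granted.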