---
license: apache-2.0
library_name: diffusers
tags:
  - stable-diffusion
  - data-augmentation
  - crosswalk-segmentation
inference: true
datasets:
  - custom
base_model:
  - CompVis/stable-diffusion-v1-4
---

## Model Description

This model is a Stable Diffusion model fine-tuned to generate realistic pedestrian-perspective images of crosswalks. It was fine-tuned on a dataset of 150 first-person-view (FPV) images, captured primarily in sunny conditions, to enable controlled text-to-image generation for data augmentation in crosswalk segmentation tasks.

- **Base model:** Stable Diffusion v1.4
- **Fine-tuning method:** Text-to-image fine-tuning on a custom FPV crosswalk dataset
- **Components:**
  - `unet`: fine-tuned U-Net weights
  - `vae`: fine-tuned VAE weights
- **Intended use:** Synthetic data generation for semantic segmentation augmentation
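
The two fine-tuned components can be loaded individually with the standard `diffusers` API. This is a sketch under the assumption that the repository follows the usual diffusers layout with `unet/` and `vae/` subfolders; the imports sit inside the function so it only needs `diffusers` installed when actually called:

```python
def load_finetuned_components(model_id: str = "kromic/sd-crosswalk-augmentation"):
    """Load the fine-tuned U-Net and VAE weights from the model repo.

    Assumes the standard diffusers subfolder layout (unet/, vae/);
    downloads the weights on first call.
    """
    from diffusers import AutoencoderKL, UNet2DConditionModel

    unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
    vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
    return unet, vae
```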

## Use Cases

- Data augmentation for crosswalk segmentation models
- Generating diverse weather and lighting scenarios (e.g., fog, rain, snow, night) from text prompts
- Research on assistive navigation systems for visually impaired pedestrians
- Benchmarking model generalization across diverse environments
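
The weather and lighting variation above is driven purely by the text prompt. A minimal sketch of building a batch of condition-specific prompts (the template wording here is illustrative, not taken from the original training captions):

```python
# Conditions listed in the use cases above; extend as needed.
CONDITIONS = ["fog", "rain", "snow", "night"]

def make_prompt(condition: str) -> str:
    """Return a pedestrian-perspective crosswalk prompt for one condition.

    The phrasing is an assumption; adapt it to whatever caption style
    the model responds to best.
    """
    return f"a first-person view photo of a crosswalk in {condition}"

prompts = [make_prompt(c) for c in CONDITIONS]
print(prompts[0])  # a first-person view photo of a crosswalk in fog
```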

## How to Use

You can generate images with the provided Python inference script:

```bash
# Clone the repository
git clone https://huggingface.co/kromic/sd-crosswalk-augmentation
cd sd-crosswalk-augmentation

# Install dependencies
pip install diffusers transformers torch

# Run inference
python generate.py
```

To customize the output, edit the `prompt` variable in `generate.py`:

```python
prompt = "a crosswalk image"
```
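
Alternatively, the pipeline can be called directly instead of running `generate.py`. This is a sketch rather than the repository's own script: the inference step count, guidance scale, and output filename are illustrative defaults, and the heavy imports live inside the function so it only requires `torch` and `diffusers` when actually invoked:

```python
def generate_crosswalk_image(prompt: str,
                             model_id: str = "kromic/sd-crosswalk-augmentation",
                             seed: int = 0,
                             out_path: str = "crosswalk.png") -> None:
    """Generate one crosswalk image with the fine-tuned pipeline.

    Downloads the model weights on first call; uses a GPU if available.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe = StableDiffusionPipeline.from_pretrained(model_id).to(device)

    # Seeded generator for reproducible outputs.
    generator = torch.Generator(device=device).manual_seed(seed)
    image = pipe(prompt,
                 num_inference_steps=50,
                 guidance_scale=7.5,
                 generator=generator).images[0]
    image.save(out_path)
```

For example, `generate_crosswalk_image("a crosswalk at night in heavy rain", seed=42)` would produce one reproducible night/rain sample for augmentation.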