---
license: apache-2.0
library_name: diffusers
tags:
- stable-diffusion
- data-augmentation
- crosswalk-segmentation
inference: true
datasets:
- custom
base_model:
- CompVis/stable-diffusion-v1-4
---
## Model Description
This model is a Stable Diffusion model fine-tuned to generate realistic pedestrian-perspective images of crosswalks. It was fine-tuned on a dataset of 150 first-person view (FPV) images, captured primarily in sunny conditions, to enable controlled text-to-image generation for data augmentation in crosswalk segmentation tasks.
- Base model: Stable Diffusion v1.4
- Fine-tuning method: Text-to-image fine-tuning using custom FPV crosswalk dataset
- Components:
  - `unet`: fine-tuned U-Net weights
  - `vae`: fine-tuned VAE weights
- Intended use: Synthetic data generation for semantic segmentation augmentation
## Use Cases
- Data augmentation for crosswalk segmentation models
- Generating diverse weather and lighting scenarios (e.g., fog, rain, snow, night) from text prompts
- Research on assistive navigation systems for visually impaired pedestrians
- Benchmarking model generalization across diverse environments
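For the weather and lighting augmentation use case, a simple helper can expand one base prompt into a set of condition variants. The template and condition list below are illustrative, not prescribed by the model card.

```python
# Illustrative weather/lighting conditions for augmentation prompts.
CONDITIONS = ["sunny", "foggy", "rainy", "snowy", "night-time"]

def build_prompts(base: str = "a first-person view of a crosswalk") -> list[str]:
    """Return one prompt per weather/lighting condition."""
    return [f"{base}, {cond} conditions" for cond in CONDITIONS]

for prompt in build_prompts():
    print(prompt)
```

Each generated prompt can then be fed to the pipeline to produce a matched set of scenes that vary only in conditions, which is useful when benchmarking segmentation robustness.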
## How to Use
You can generate images with the provided Python inference script:
```bash
# Clone the repository
git clone https://huggingface.co/kromic/sd-crosswalk-augmentation
cd sd-crosswalk-augmentation

# Install dependencies
pip install diffusers transformers torch

# Run inference
python generate.py
```

To customize the prompt, edit the `prompt` variable in the script:

```python
prompt = "a crosswalk image"
```