---
license: apache-2.0
library_name: diffusers
tags:
- stable-diffusion
- data-augmentation
- crosswalk-segmentation
inference: true
datasets:
- custom
base_model:
- CompVis/stable-diffusion-v1-4
---
## Model Description
This model is a Stable Diffusion model fine-tuned to generate realistic pedestrian-perspective images of crosswalks. It was fine-tuned on a dataset of 150 first-person view (FPV) images, captured primarily in sunny conditions, to enable controlled text-to-image generation for data augmentation in crosswalk segmentation tasks.
- Base model: Stable Diffusion v1.4
- Fine-tuning method: Text-to-image fine-tuning using custom FPV crosswalk dataset
- Components:
  - `unet`: fine-tuned U-Net weights
  - `vae`: fine-tuned VAE weights
- Intended use: Synthetic data generation for semantic segmentation augmentation
## Use Cases
- Data augmentation for crosswalk segmentation models
- Generating diverse weather and lighting scenarios (e.g., fog, rain, snow, night) from text prompts
- Research on assistive navigation systems for visually impaired pedestrians
- Benchmarking model generalization across diverse environments
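For the weather and lighting augmentation use case, a simple helper can expand one base prompt into a set of condition variants. The template and condition list below are illustrative, not prescribed by the model card.

```python
# Illustrative weather/lighting conditions for augmentation prompts.
CONDITIONS = ["sunny", "foggy", "rainy", "snowy", "night-time"]

def build_prompts(base: str = "a first-person view of a crosswalk") -> list[str]:
    """Return one prompt per weather/lighting condition."""
    return [f"{base}, {cond} conditions" for cond in CONDITIONS]

for prompt in build_prompts():
    print(prompt)
```

Each generated prompt can then be fed to the pipeline to produce a matched set of scenes that vary only in conditions, which is useful when benchmarking segmentation robustness.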
## How to Use
You can generate images with the provided Python inference script:
```bash
# Clone the repository
git clone https://huggingface.co/kromic/sd-crosswalk-augmentation
cd sd-crosswalk-augmentation

# Install dependencies
pip install diffusers transformers torch

# Run inference
python generate.py
```

To customize the prompt, edit the `prompt` variable in the script:

```python
prompt = "a crosswalk image"
```