import torch
from diffusers import DiffusionPipeline
# switch "cuda" to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained("kromic/sd-crosswalk-augmentation", torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]

Model Description
This model is a Stable Diffusion model fine-tuned to generate realistic pedestrian-perspective images of crosswalks. It was fine-tuned on a dataset of 150 first-person view (FPV) images, primarily captured in sunny conditions, to enable controlled text-to-image generation for data augmentation in crosswalk segmentation tasks.
- Base model: Stable Diffusion v1.4
- Fine-tuning method: Text-to-image fine-tuning using custom FPV crosswalk dataset
- Components:
  - unet: fine-tuned U-Net weights
  - vae: fine-tuned VAE weights
- Intended use: Synthetic data generation for semantic segmentation augmentation
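Because only the U-Net and VAE were fine-tuned, those components can also be loaded individually and swapped into the base SD v1.4 pipeline. A minimal sketch, assuming the repository stores the weights in the standard `unet` and `vae` subfolders (the function name is illustrative, not part of this repo):

```python
def load_with_finetuned_components(
    repo_id="kromic/sd-crosswalk-augmentation",
    base_id="CompVis/stable-diffusion-v1-4",
):
    # Imported lazily so this sketch can be read without diffusers installed.
    from diffusers import AutoencoderKL, StableDiffusionPipeline, UNet2DConditionModel

    # Assumption: the fine-tuned weights live in standard "unet"/"vae" subfolders.
    unet = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet")
    vae = AutoencoderKL.from_pretrained(repo_id, subfolder="vae")
    # Plug the fine-tuned components into the base SD v1.4 pipeline.
    return StableDiffusionPipeline.from_pretrained(base_id, unet=unet, vae=vae)
```

This pattern is useful when you want to keep the base model's text encoder and scheduler but use the crosswalk-specialized U-Net and VAE.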
Use Cases
- Data augmentation for crosswalk segmentation models
- Generating diverse weather and lighting scenarios (e.g., fog, rain, snow, night) from text prompts
- Research on assistive navigation systems for visually impaired pedestrians
- Benchmarking model generalization across diverse environments
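The weather/lighting use case above amounts to pairing one base scene description with a list of condition phrases. A minimal sketch of such prompt templating; the base description and condition list are illustrative assumptions, not from the model card:

```python
# Illustrative base scene and conditions for augmentation prompts (assumptions).
BASE_SCENE = "first-person view of a pedestrian crosswalk in a city street"
CONDITIONS = [
    "on a sunny day",
    "in dense fog",
    "in heavy rain",
    "covered in snow",
    "at night under streetlights",
]

def build_prompts(base=BASE_SCENE, conditions=CONDITIONS):
    """Combine the base scene with each condition into one prompt per variant."""
    return [f"{base}, {cond}, photorealistic, detailed" for cond in conditions]

for p in build_prompts():
    print(p)
```

Each generated prompt can then be passed to the pipeline above to produce one image per condition, giving a segmentation dataset coverage of conditions absent from the mostly-sunny training set.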
How to Use
You can generate images with the provided Python inference script:
# Clone the repository
git clone https://huggingface.co/kromic/sd-crosswalk-augmentation
cd sd-crosswalk-augmentation
# Install dependencies
pip install diffusers transformers torch
# Run inference
python generate.py
To customize the output, edit the `prompt` variable in `generate.py`:
prompt = "a crosswalk image"