---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
library_name: onnx
tags:
- stable-diffusion
- text-to-image
- diffusion
- webgpu
- browser-ai
- onnx
- zhare-ai
- client-side
- privacy-preserving
pipeline_tag: text-to-image
inference: false
widget:
- text: "A beautiful sunset over mountains, digital art style"
  example_title: "Mountain Sunset"
- text: "A futuristic cityscape with flying cars at night, cyberpunk"
  example_title: "Cyberpunk City"
- text: "A serene lake surrounded by autumn trees, oil painting"
  example_title: "Autumn Lake"
- text: "Portrait of a wise elderly person, studio lighting, photorealistic"
  example_title: "Portrait"
model-index:
- name: sd-1-5-webgpu
  results:
  - task:
      type: text-to-image
      name: Text-to-Image Generation
    dataset:
      name: Browser Performance Benchmark
      type: webgpu-inference
    metrics:
    - type: generation-time
      value: 3-45
      name: Generation Time (seconds)
      config: 512x512, 20 steps, various hardware
    - type: memory-usage
      value: 4-6
      name: VRAM Usage (GB)
      config: WebGPU acceleration
    - type: model-size
      value: 3.5
      name: Total Model Size (GB)
      config: All ONNX components
---

<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI Logo" width="200" height="auto" style="margin-bottom: 20px;">
</div>

# Stable Diffusion 1.5 WebGPU by Zhare-AI

<div align="center">

**Privacy-preserving text-to-image generation in your browser with WebGPU acceleration**

</div>

This is Stable Diffusion v1.5 converted to ONNX and optimized for client-side inference with WebGPU acceleration. Developed by **Zhare-AI**, it generates images directly in the web browser without server infrastructure, so prompts and generated images stay on the user's device.

<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI - Democratizing AI" width="150" height="auto">
<p><em>Democratizing AI through distributed computing and privacy-preserving technology</em></p>
</div>

## Key Features

- **Fully Client-Side**: Complete image generation in the browser, no data leaves your device
- ⚡ **WebGPU Accelerated**: Hardware-accelerated inference with automatic WebAssembly fallback
- **Privacy-First**: All processing happens locally, protecting user prompts and generated content
- 📱 **Cross-Platform**: Compatible with desktop and mobile browsers
- 🛠️ **Production-Ready**: Optimized for real-world web applications

## Quick Start

### Installation & Setup

```bash
# Clone or download the model
git lfs install
git clone https://huggingface.co/Zhare-AI/sd-1-5-webgpu
```

## Performance Specifications

### Model Architecture

| Component | Description | Approximate Size |
|-----------|-------------|------------------|
| **Text Encoder** | CLIP ViT-L/14 for text understanding | ~500 MB |
| **UNet** | Core diffusion model for image generation | ~3.4 GB |
| **VAE Decoder** | Converts latents to final images | ~160 MB |
| **VAE Encoder** | Encodes images to latent space | ~160 MB |
| **Safety Checker** | Content filtering (optional) | ~600 MB |

**Total Model Size**: ~4.8 GB (without safety checker: ~4.2 GB)

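
The totals above follow directly from the per-component sizes in the table. As a quick check, the sketch below sums them with and without the optional safety checker. The identifier names are illustrative and are not file names from the repository; the sizes are the approximate figures from the table, with 1 GB taken as 1000 MB.

```typescript
// Approximate component sizes in MB, copied from the Model Architecture table.
// The keys are illustrative labels, not repository file names.
const componentSizesMB: Record<string, number> = {
  textEncoder: 500,
  unet: 3400,
  vaeDecoder: 160,
  vaeEncoder: 160,
  safetyChecker: 600,
};

// Sum all component sizes, optionally leaving out the safety checker.
function totalDownloadMB(includeSafetyChecker: boolean): number {
  return Object.entries(componentSizesMB)
    .filter(([name]) => includeSafetyChecker || name !== "safetyChecker")
    .reduce((sum, [, size]) => sum + size, 0);
}

// Format a size in MB as GB (1 GB = 1000 MB) with one decimal place.
function formatGB(mb: number): string {
  return `${(mb / 1000).toFixed(1)} GB`;
}

console.log(formatGB(totalDownloadMB(true)));  // all components
console.log(formatGB(totalDownloadMB(false))); // without the safety checker
```
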
### Browser Performance Benchmarks

*Generation time for 512×512 images with 20 inference steps:*

| Hardware Category | Example Device | Typical Generation Time |
|------------------|----------------|-------------------|
| **High-End Desktop** | RTX 4090, RTX 4080 | 3-8 seconds |
| **Gaming Desktop** | RTX 3080, RTX 3070 | 8-15 seconds |
| **Intel Arc GPUs** | Arc A750, Arc A770 | 8-15 seconds |
| **AMD High-End** | RX 7900 XT/XTX | 6-12 seconds |
| **Apple Silicon** | M2 Max, M1 Ultra | 10-20 seconds |
| **Integrated GPUs** | Intel Iris Xe | 25-50 seconds |
| **WebAssembly Fallback** | CPU-only devices | 2-10 minutes |

### System Requirements

- **Minimum VRAM**: 4 GB (recommended: 6 GB+)
- **System RAM**: 8 GB minimum, 16 GB recommended
- **Storage**: 5 GB free space for model files
- **Browser**: Chrome 113+, Edge 113+ (WebGPU), or any modern browser (WebAssembly fallback)

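
An application can check the storage requirement before it starts downloading. The sketch below assumes the app caches the model files in origin storage (Cache API, IndexedDB, or OPFS), which this card does not specify. `hasRoomForModel` and `StorageEstimateLike` are hypothetical names; the input mirrors the object returned by the standard `navigator.storage.estimate()` call.

```typescript
// Mirrors the fields of the object returned by navigator.storage.estimate().
interface StorageEstimateLike {
  quota?: number; // bytes the origin may use
  usage?: number; // bytes the origin already uses
}

// 5 GB, taken from the storage requirement listed above.
const MODEL_BYTES = 5e9;

// Returns true when the remaining origin quota can hold the model files.
function hasRoomForModel(
  estimate: StorageEstimateLike,
  neededBytes: number = MODEL_BYTES,
): boolean {
  // An unknown quota is treated as "not enough" to stay on the safe side.
  if (estimate.quota === undefined) return false;
  const freeBytes = estimate.quota - (estimate.usage ?? 0);
  return freeBytes >= neededBytes;
}

// In a page (assumption: files are cached in origin storage):
// const ok = hasRoomForModel(await navigator.storage.estimate());
```

The quota reported by the browser is an estimate, so a download can still fail when the disk fills up. Treat the check as an early warning, not a guarantee.
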
## Browser Compatibility

| Browser | WebGPU Support | Performance Level | Notes |
|---------|---------------|------------------|-------|
| **Chrome 113+** | ✅ Full Support | Excellent | Primary recommendation |
| **Microsoft Edge 113+** | ✅ Full Support | Excellent | Primary recommendation |
| **Firefox 141+** | ✅ Stable Support | Very Good | Recent WebGPU implementation |
| **Safari 17.4+** | 🔶 Experimental | Good | Behind feature flag |
| **Mobile Chrome 121+** | 🔶 Limited | Fair | Android only, limited memory |

*All browsers support WebAssembly fallback for universal compatibility*

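
Because WebGPU support varies by browser, an application would normally detect it at runtime and switch to WebAssembly when no adapter is available. The sketch below shows one way to do that with the standard `navigator.gpu.requestAdapter()` call. `pickBackend` and `NavigatorLike` are hypothetical names, and the `navigator` object is passed in as a parameter so the function also runs outside a browser.

```typescript
type Backend = "webgpu" | "wasm";

// Minimal structural type for the part of `navigator` used here, so the
// sketch compiles without the @webgpu/types package.
interface NavigatorLike {
  gpu?: { requestAdapter(): Promise<unknown> };
}

// Resolves to "webgpu" only when the browser exposes WebGPU and returns a
// usable adapter; every other case resolves to the WebAssembly fallback.
async function pickBackend(nav: NavigatorLike | undefined): Promise<Backend> {
  if (!nav?.gpu) return "wasm"; // WebGPU API not exposed at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no suitable GPU
  } catch {
    return "wasm"; // adapter request failed, e.g. blocked by the browser
  }
}

// In a page:
// const backend = await pickBackend(navigator as unknown as NavigatorLike);
```

Checking for a non-null adapter matters because a browser can expose `navigator.gpu` and still return no adapter on unsupported hardware.
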
## Model Details

### Training Information

This model is based on Stable Diffusion v1.5 with the following training characteristics:

- **Base Dataset**: LAION-5B filtered subset (~590M image-text pairs)
- **Training Resolution**: 512×512 pixels
- **Architecture**: Latent Diffusion Model with CLIP ViT-L/14 text encoder
- **Precision**: Originally trained in FP32, optimized to FP16 for browser deployment

### Optimization for Web Deployment

- **ONNX Conversion**: Optimized computational graph for web inference
- **WebGPU Kernels**: Custom compute shaders for GPU acceleration
- **Memory Efficiency**: Attention slicing and dynamic memory management
- **Cross-Platform**: WebAssembly fallback ensures universal browser support

## 🛡️ Ethical Use and Safety

### Built-in Safety Features

- **Content Filter**: Optional NSFW detection and filtering
- **Prompt Sanitization**: Basic filtering of potentially harmful prompts
- **Local Processing**: No data transmission ensures privacy protection

### Responsible Use Guidelines

✅ **Encouraged Uses:**
- Creative art and design projects
- Educational demonstrations of AI capabilities
- Rapid prototyping for applications
- Personal creative exploration
- Research and development

❌ **Prohibited Uses:**
- Creating harmful, offensive, or illegal content
- Generating misleading information or deepfakes
- Violating copyright or intellectual property rights
- Any use that violates the CreativeML OpenRAIL-M license terms

### Privacy and Data Protection

- **Zero Data Collection**: All processing occurs locally in your browser
- **No Server Communication**: The model runs entirely offline after the initial download
- **User Control**: Complete control over generated content and prompts
- **GDPR Compliant**: No personal data is transmitted to or stored on a server

## ⚠️ Limitations and Considerations

### Technical Limitations

- **Resolution**: Optimized for 512×512 (other resolutions may reduce quality)
- **Batch Size**: Single image generation only in browser environment
- **Memory Constraints**: Limited by browser and device VRAM/RAM
- **Generation Speed**: Slower than dedicated server hardware

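
The resolution limit ties back to the latent space. The Stable Diffusion v1.5 VAE downsamples each spatial dimension by a factor of 8 and uses 4 latent channels, so a 512×512 image corresponds to a 1×4×64×64 latent tensor. The sketch below computes that shape and rejects sizes that are not multiples of 8. `latentShape` is a hypothetical helper, not part of this repository.

```typescript
const VAE_SCALE = 8;       // spatial downsampling factor of the SD 1.5 VAE
const LATENT_CHANNELS = 4; // number of latent channels in SD 1.5

// Returns the latent tensor shape [batch, channels, height, width] for one
// image; batch is fixed at 1 to match the single-image limit above.
function latentShape(
  width: number,
  height: number,
): [number, number, number, number] {
  const valid = (n: number) =>
    Number.isInteger(n) && n > 0 && n % VAE_SCALE === 0;
  if (!valid(width) || !valid(height)) {
    throw new RangeError(
      `width and height must be positive multiples of ${VAE_SCALE}`,
    );
  }
  return [1, LATENT_CHANNELS, height / VAE_SCALE, width / VAE_SCALE];
}

console.log(latentShape(512, 512)); // shape for the recommended resolution
```
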
### Content Limitations

- **Language Bias**: Best performance with English prompts
- **Cultural Representation**: Training data may reflect Western/English-speaking biases
- **Artistic Style**: Tendency toward photorealistic and digital art styles
- **Consistency**: Multiple generations from same prompt may vary significantly

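
Much of the variation between runs comes from the random noise that diffusion sampling starts from. Fixing the seed of that noise makes a run repeatable on the same setup. The sketch below is a minimal illustration that pairs a small seeded generator (mulberry32) with the Box-Muller transform. The function names are hypothetical, the random number generator this model's pipeline actually uses is not documented here, and the output is not bit-compatible with seeds from other Stable Diffusion implementations.

```typescript
// mulberry32: a small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fills a buffer with standard normal samples using the Box-Muller transform.
// The same seed always produces the same buffer.
function seededGaussianNoise(count: number, seed: number): Float32Array {
  const rand = mulberry32(seed);
  const out = new Float32Array(count);
  for (let i = 0; i < count; i += 2) {
    const u1 = 1 - rand(); // in (0, 1], so Math.log never sees 0
    const u2 = rand();
    const radius = Math.sqrt(-2 * Math.log(u1));
    out[i] = radius * Math.cos(2 * Math.PI * u2);
    if (i + 1 < count) out[i + 1] = radius * Math.sin(2 * Math.PI * u2);
  }
  return out;
}

// Initial latents for one 512×512 image: 4 × 64 × 64 = 16384 values.
const latents = seededGaussianNoise(4 * 64 * 64, 42);
console.log(latents.length);
```
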
### Browser-Specific Considerations

- **WebGPU Availability**: Limited to supporting browsers and devices
- **Memory Management**: Browser security limits may affect large model loading
- **Performance Variance**: Significant variation across different devices and browsers

## License: CreativeML OpenRAIL-M

This model is released under the **CreativeML OpenRAIL-M** license, which allows for:

✅ **Permitted:**
- Commercial and non-commercial use
- Distribution and modification
- Creation of derivative works
- Integration into applications and services

🚫 **Restrictions:**
- Must not be used to generate harmful content
- Cannot be used for illegal activities
- Must include license terms in any distribution
- Derivative works must maintain the same license restrictions

**Full License Text**: Available at [CreativeML OpenRAIL-M License](https://huggingface.co/spaces/CompVis/stable-diffusion-license)

### License Compliance

When using this model:
1. **Include License**: Provide license terms to end users
2. **Respect Restrictions**: Ensure use cases comply with content restrictions
3. **Derivative Works**: Apply same license to modified versions
4. **Attribution**: Credit original Stable Diffusion creators and Zhare-AI adaptation

## 🏢 About Zhare-AI

<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI" width="120" height="auto" style="margin: 20px 0;">
</div>

**Zhare-AI** is focused on democratizing AI technology by making powerful models accessible directly in web browsers. Our mission is to enable privacy-preserving AI applications that put users in control of their data and creative processes.

- **Website**: [zhare.ai](https://zhare.ai)
- **Focus**: Distributed AI computing and browser-based AI applications
- **Philosophy**: Privacy-first, user-controlled AI experiences
- **Vision**: Making AI accessible, private, and distributed

### Our Mission

We believe AI should be:
- **Accessible** to everyone, regardless of infrastructure
- **Private** with complete user data control
- **Distributed** across devices rather than centralized servers
- **Transparent** with open-source implementations

## Citation and References

### Cite This Work

```bibtex
@misc{zhare-ai-sd15-webgpu-2025,
  title={Stable Diffusion 1.5 WebGPU: Browser-Optimized Text-to-Image Generation},
  author={Zhare-AI},
  year={2025},
  howpublished={\url{https://huggingface.co/Zhare-AI/sd-1-5-webgpu}},
  note={WebGPU-optimized implementation for privacy-preserving browser-based image generation}
}
```

### Original Stable Diffusion Citation

```bibtex
@InProceedings{Rombach_2022_CVPR,
  author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Björn},
  title = {High-Resolution Image Synthesis With Latent Diffusion Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2022},
  pages = {10684-10695}
}
```

## 🤝 Community and Support

### Getting Help

- **Issues**: Report technical problems via the repository issues
- **Discussions**: Join the community discussion for tips and examples
- **Documentation**: Comprehensive guides available in the repository

### Contributing

We welcome contributions to improve browser compatibility, performance, and user experience:

- Performance optimizations for different hardware
- Browser compatibility improvements
- Documentation enhancements
- Example applications and tutorials

---

<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI" width="100" height="auto">

**Ready to create amazing images directly in your browser?**

*This model brings the power of Stable Diffusion to web applications while keeping your data completely private and secure.*

**Developed with ❤️ by Zhare-AI for the open-source community**

[Visit Zhare.ai](https://zhare.ai) | [📧 Contact Us](mailto:contact@zhare.ai) | [💬 Join Discussion](https://github.com/Zhare-AI)

</div>