karthik-2905
/

DiffusionPretrained

karthik-2905 committed on Jul 19, 2025

Commit 42c78f7 (verified) · 1 parent: da54531

Rename readme.md to README.md

Files changed (2)

README.md +157 -0
readme.md +0 -92

README.md ADDED

	@@ -0,0 +1,157 @@

+# CIFAR-10 Diffusion Model
+A lightweight diffusion model trained from scratch on the CIFAR-10 dataset in just 14.5 minutes using PyTorch.
+## Model Description
+This is a **SimpleUNet-based diffusion model** trained to generate 32x32 RGB images similar to the CIFAR-10 dataset. The model demonstrates the fundamentals of diffusion-based image generation with a compact architecture suitable for educational purposes and quick experimentation.
+### Key Features
+- 🚀 **Fast Training**: Complete training in under 15 minutes on RTX 3060
+- 💾 **Lightweight**: Only 16.8M parameters (~64MB model size)
+- 🎯 **Educational**: Clean, well-documented code for learning diffusion models
+- ⚡ **Efficient Inference**: Generate images in seconds on consumer GPUs
+## Model Details
+| Attribute | Value |
+|-----------|-------|
+| **Architecture** | SimpleUNet with ResNet blocks + Attention |
+| **Parameters** | 16,808,835 |
+| **Dataset** | CIFAR-10 (50,000 training images) |
+| **Image Size** | 32×32 RGB |
+| **Training Steps** | 7,820 (20 epochs × 391 batches) |
+| **Training Time** | 14.54 minutes |
+| **Hardware** | NVIDIA RTX 3060 (0.43GB VRAM used) |
+| **Framework** | PyTorch 2.0+ |
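The figures in the table can be cross-checked with a little arithmetic. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each), which the model card does not state, and takes the batch size of 128 from the Training Details section.

```python
import math

num_params = 16_808_835
bytes_per_param = 4  # assumption: float32 weights

# Checkpoint size implied by the parameter count, in MiB
size_mib = num_params * bytes_per_param / 2**20

# 50,000 training images in batches of 128; the last batch is partial
batches_per_epoch = math.ceil(50_000 / 128)
total_steps = 20 * batches_per_epoch

# Average optimizer steps per second over the 14.54-minute run
steps_per_second = total_steps / (14.54 * 60)

print(f"{size_mib:.1f} MiB, {batches_per_epoch} batches/epoch, "
      f"{total_steps} steps, {steps_per_second:.2f} steps/s")
```

Under that assumption the parameter count works out to roughly 64 MiB, which agrees with the file sizes listed for the checkpoints.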
+## Quick Start
+### Installation
+```bash
+pip install torch torchvision matplotlib tqdm pillow numpy
+```
+### Basic Usage
+```python
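+import torch
+import matplotlib.pyplot as plt
+# SimpleUNet and DDPMScheduler are the model classes shipped in inference_example.py;
+# define or import them before running this snippet.
+device = 'cuda' if torch.cuda.is_available() else 'cpu'
+# Load model (map_location lets a GPU-saved checkpoint load on CPU-only machines)
+checkpoint = torch.load('complete_diffusion_model.pth', map_location=device)
+model = SimpleUNet(**checkpoint['model_config']).to(device)
+model.load_state_dict(checkpoint['model_state_dict'])
+model.eval()
+# Initialize scheduler
+scheduler = DDPMScheduler(**checkpoint['diffusion_config'])
+# Generate images
+@torch.no_grad()
+def generate_images(model, scheduler, num_images=4):
+    device = next(model.parameters()).device
+    images = torch.randn(num_images, 3, 32, 32, device=device)
+    one = torch.tensor(1.0, device=device)
+    for t in range(999, -1, -20):  # 50 denoising steps: t = 999, 979, ..., 19
+        timestep = torch.full((num_images,), t, device=device, dtype=torch.long)
+        noise_pred = model(images, timestep)
+        # Deterministic DDIM-style step (a simplification of full DDPM sampling)
+        alpha_t = scheduler.alpha_cumprod[t].to(device)
+        # torch.sqrt needs a tensor, so the final step uses a tensor 1.0, not a float
+        alpha_prev = scheduler.alpha_cumprod[t - 20].to(device) if t >= 20 else one
+        pred_x0 = (images - torch.sqrt(1 - alpha_t) * noise_pred) / torch.sqrt(alpha_t)
+        images = torch.sqrt(alpha_prev) * pred_x0 + torch.sqrt(1 - alpha_prev) * noise_pred
+    return images
+# Generate and display
+generated = generate_images(model, scheduler)
+# Assumption: training images were scaled to [-1, 1]; map back to [0, 1] for display
+grid = ((generated.clamp(-1, 1) + 1) / 2).cpu().permute(0, 2, 3, 1)
+fig, axes = plt.subplots(1, len(grid), figsize=(2 * len(grid), 2))
+for ax, img in zip(axes, grid):
+    ax.imshow(img.numpy())
+    ax.axis('off')
+plt.show()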
+```
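The update inside the sampling loop can be checked on plain scalars. The sketch below is an illustration, not the repository's code: `ddim_step` is a hypothetical helper name, and the two cumulative-alpha values are made up. It shows that when the noise prediction is exact, each step lands on the correctly noised sample at the earlier level, and the last step (where the previous cumulative alpha is 1.0) returns the clean value.

```python
import math

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic update: estimate x0, then re-noise it to the earlier level."""
    pred_x0 = (x_t - math.sqrt(1 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)
    return math.sqrt(alpha_bar_prev) * pred_x0 + math.sqrt(1 - alpha_bar_prev) * eps_pred

# Timesteps visited by the sampling loop: 999, 979, ..., 19
timesteps = list(range(999, -1, -20))

# Illustrative values (hypothetical, not taken from the trained scheduler)
x0, eps = 0.5, -1.2
alpha_bar_t, alpha_bar_prev = 0.30, 0.60

# Forward-noise x0 to level t, then step back with a perfect noise prediction
x_t = math.sqrt(alpha_bar_t) * x0 + math.sqrt(1 - alpha_bar_t) * eps
x_prev = ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev)
expected_prev = math.sqrt(alpha_bar_prev) * x0 + math.sqrt(1 - alpha_bar_prev) * eps

# Final step: alpha_bar_prev = 1.0 leaves only the x0 estimate
x_final = ddim_step(x_t, eps, alpha_bar_t, 1.0)
```

In practice the network's prediction is only approximate, so the estimate of the clean image is refined over all 50 steps rather than recovered in one.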
+## Training Details
+- **Loss Function**: MSE between predicted and actual noise
+- **Optimizer**: AdamW (lr=1e-4, weight_decay=1e-6)
+- **Scheduler**: CosineAnnealingLR
+- **Batch Size**: 128
+- **Final Loss**: 0.0363 (73% lower than the epoch-1 loss of 0.1349)
+- **Diffusion Steps**: 1000 (linear beta schedule)
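The training objective listed above can be sketched in plain Python. This is a minimal illustration, not the training code: the beta endpoints (1e-4 to 0.02) are the common DDPM defaults and are an assumption here, since the model card only says "linear beta schedule", and all function names are hypothetical.

```python
import math
import random

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced betas; the endpoints are assumed defaults."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """Running product of (1 - beta), i.e. alpha-bar for each timestep."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def add_noise(x0, noise, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    s, n = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [s * x + n * e for x, e in zip(x0, noise)]

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)
x0 = [rng.uniform(-1, 1) for _ in range(3 * 32 * 32)]  # stand-in for one image
noise = [rng.gauss(0, 1) for _ in x0]
t = rng.randrange(len(betas))                          # random timestep
x_t = add_noise(x0, noise, alpha_bars[t])

# The network sees (x_t, t) and is trained to output `noise`.
# A predictor that always returns zeros scores an MSE near 1.
loss_zero_predictor = mse([0.0] * len(noise), noise)
loss_perfect_predictor = mse(noise, noise)
```

A zero predictor scores close to 1 because the target noise has unit variance, which gives some context for the reported losses of 0.1349 and 0.0363.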
+## Performance
+### Training Loss Curve
+Training loss fell steadily over the 20 epochs:
+- **Epoch 1**: 0.1349 → **Epoch 20**: 0.0363
+- **Best Loss**: 0.0358 (Epoch 19)
+- **Stable convergence**: loss plateaued near 0.036 over the final two epochs
+### Generation Quality
+- ✅ Captures CIFAR-10 color distributions
+- ✅ Generates diverse, non-repetitive outputs
+- ⚠️ Produces abstract patterns (needs longer training to generate recognizable objects)
+- 🎯 Suitable for color/texture generation tasks
+## Files in this Repository
+| File | Description | Size |
+|------|-------------|------|
+| `complete_diffusion_model.pth` | Full model with config and weights | ~64MB |
+| `diffusion_model_final.pth` | Training checkpoint (epoch 20) | ~64MB |
+| `model_info.json` | Training metadata and hyperparameters | <1KB |
+| `inference_example.py` | Complete inference script with model classes | ~5KB |
+## Model Architecture
+```
+SimpleUNet(
+  time_embedding: TimeEmbedding(128)
+  encoder: 3 ResNet blocks with downsampling
+  middle: ResNet + Self-Attention + ResNet
+  decoder: 3 ResNet blocks with upsampling
+  output: GroupNorm → SiLU → Conv2d
+)
+```
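The summary lists a 128-dimensional `TimeEmbedding` but does not say how it is computed. A common choice in diffusion U-Nets is a sinusoidal encoding of the timestep. The sketch below shows that variant purely as an assumption; the function name, the `max_period` value, and the sin-then-cos layout are illustrative and may differ from the repository's implementation.

```python
import math

def sinusoidal_embedding(t, dim=128, max_period=10000.0):
    """Map an integer timestep to `dim` values: sines first, then cosines."""
    half = dim // 2
    # Geometrically spaced frequencies from 1 down to about 1 / max_period
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

emb_0 = sinusoidal_embedding(0)
emb_500 = sinusoidal_embedding(500)
```

Each timestep gets a distinct vector with bounded values, which the network can then project and add to its feature maps to condition on the noise level.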
+## Use Cases
+- 🎓 **Educational**: Learn diffusion model fundamentals
+- 🔬 **Research**: Baseline for diffusion experiments
+- 🎨 **Art**: Generate abstract textures and patterns
+- ⚡ **Prototyping**: Quick diffusion model testing
+## Limitations & Improvements
+### Current Limitations
+- Generates abstract patterns rather than recognizable objects
+- Trained on small 32×32 resolution
+- Limited to 20 training epochs
+### Suggested Improvements
+1. **Extended Training**: 50-100 epochs for better object generation
+2. **Larger Architecture**: Increase model capacity
+3. **Advanced Sampling**: Implement DDIM or DPM-Solver++
+4. **Higher Resolution**: Train on 64×64 or 128×128 images
+5. **Better Datasets**: Use CelebA-HQ or custom datasets
+## Citation
+```bibtex
+@misc{cifar10-diffusion-2025,
+  title={CIFAR-10 Diffusion Model: Fast Training Implementation},
+  author={Karthik},
+  year={2025},
+  publisher={Hugging Face},
+  howpublished={\url{https://huggingface.co/karthik-2905/DiffusionPretrained}}
+}
+```
+## License
+MIT License - Free for research and commercial use.
+---
+**🚀 Want to train your own?** Check out the [full implementation](https://github.com/GruheshKurra/DiffusionModelPretrained) with Jupyter notebooks and step-by-step training code!
+**📊 Training Stats**: 16.8M params • 14.5min training • RTX 3060 • PyTorch 2.0

readme.md DELETED

@@ -1,92 +0,0 @@
-# CIFAR-10 Diffusion Model
-🎨 **A diffusion model trained from scratch on CIFAR-10 dataset**
-## Model Details
-- **Architecture**: SimpleUNet with 16.8M parameters
-- **Dataset**: CIFAR-10 (50,000 training images)
-- **Training Time**: 14.54 minutes on RTX 3060
-- **Final Loss**: 0.0363
-- **Image Size**: 32x32 RGB
-- **Framework**: PyTorch
-## Quick Start
-```python
-import torch
-from model import SimpleUNet, DDPMScheduler, generate_images
-# Load the trained model
-checkpoint = torch.load('complete_diffusion_model.pth')
-model = SimpleUNet(**checkpoint['model_config'])
-model.load_state_dict(checkpoint['model_state_dict'])
-model.eval()
-# Initialize scheduler
-scheduler = DDPMScheduler(**checkpoint['diffusion_config'])
-# Generate images
-generated_images = generate_images(model, scheduler, num_images=8)
-```
-## Installation
-```bash
-pip install torch>=2.0.0 torchvision>=0.15.0 matplotlib tqdm pillow numpy
-```
-## Files Included
-- `complete_diffusion_model.pth` - Complete model with config (64MB)
-- `model_info.json` - Training details and metadata
-- `diffusion_model_final.pth` - Final training checkpoint (64MB)
-- `inference_example.py` - Ready-to-use inference script
-## Training Details
-- **Epochs**: 20
-- **Batch Size**: 128
-- **Learning Rate**: 1e-4 (CosineAnnealingLR)
-- **Optimizer**: AdamW
-- **GPU**: NVIDIA RTX 3060 (0.43GB VRAM used)
-- **Loss Reduction**: 73% (from 0.1349 to 0.0363)
-## Hardware Requirements
-- **Minimum**: 1GB VRAM for inference
-- **Recommended**: 2GB+ VRAM for training extensions
-- **CPU**: Works but slower
-## Results
-The model generates colorful abstract patterns that capture CIFAR-10's color distributions.
-With more training epochs (50-100), it should produce more recognizable objects.
-## Improvements
-To get better results:
-1. **Train longer**: 50-100 epochs instead of 20
-2. **Larger model**: Increase channels/layers
-3. **Advanced sampling**: DDIM, DPM-Solver
-4. **Better datasets**: CelebA, ImageNet
-5. **Learning rate**: Experiment with schedules
-## Model Architecture
-- **U-Net based** with ResNet blocks
-- **Time embedding** for diffusion timesteps
-- **Attention layers** at multiple resolutions
-- **Skip connections** for better gradient flow
-## Citation
-```bibtex
-@misc{cifar10-diffusion-2025,
-  title={CIFAR-10 Diffusion Model},
-  author={Your Name},
-  year={2025},
-  url={https://github.com/your-username/cifar10-diffusion}
-}
-```
-## License
-MIT License - Feel free to use and modify!
----
-**Created**: July 19, 2025
-**Training Time**: 14.54 minutes
-**GPU**: NVIDIA RTX 3060
-**Framework**: PyTorch