---
title: Person Classification Demo
emoji: πŸ“±
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---

# πŸ“± MobileNetV2 Image Classification Demo

A lightweight, interactive image classification demo built with Hugging Face Transformers and Gradio. It brings the clean interface style of the popular Whisper demos to computer-vision classification tasks.

## ✨ Features

- **Mobile-Optimized Model**: Uses MobileNetV2, designed specifically for efficient mobile and edge deployment
- **Interactive Web Interface**: Clean, modern Gradio interface similar to the Whisper demos
- **Real-Time Classification**: Instant image classification with top-5 predictions
- **1,000 ImageNet Classes**: Recognizes a wide variety of objects, animals, vehicles, and scenes
- **Confidence Scores**: Shows prediction confidence as percentages
- **Example Images**: Pre-loaded example images for quick testing
- **Responsive Design**: Works seamlessly on desktop and mobile devices
- **Lightweight**: Only ~3.4M parameters for fast inference

## πŸš€ Quick Start

### Option 1: Hugging Face Spaces (Recommended)

Deploy instantly to Hugging Face Spaces:

1. Create a new Space on Hugging Face Spaces
2. Upload these files: `app.py`, `requirements.txt`, `README.md`
3. Your demo goes live automatically!

### Option 2: Local Development

```bash
# Clone this repository
git clone <your-repo-url>
cd mobilenetv2-classification-demo

# Install dependencies
pip install -r requirements.txt

# Run the demo
python app.py
```

The demo will be available at http://localhost:7860.

## 🎯 How to Use

1. **Upload Image**: Click the upload area or drag and drop an image
2. **Get Results**: Classification runs automatically, showing the top-5 predictions
3. **Try Examples**: Use the example buttons to test with sample images
4. **View Confidence**: Each prediction shows a confidence percentage

## πŸ“Š Model Information

- **Model**: `google/mobilenet_v2_1.0_224`
- **Architecture**: MobileNetV2 with a 1.0 width multiplier
- **Input Size**: 224Γ—224 pixels
- **Parameters**: ~3.4 million (lightweight!)
- **Classes**: 1,000 ImageNet categories
- **Optimization**: Designed for mobile and edge devices

## πŸ”§ Technical Details

### Dependencies

- **PyTorch**: Deep learning framework
- **Transformers**: Hugging Face model library
- **Gradio**: Web interface framework
- **Pillow**: Image processing
- **NumPy**: Numerical computing
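
For reference, a `requirements.txt` covering these dependencies might look like the sketch below; the version pins are illustrative, except for Gradio, which should match the `sdk_version` declared in the Space metadata:

```
torch>=2.0
transformers>=4.40
gradio==4.44.0
Pillow>=9.0
numpy>=1.24
```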

### Model Architecture

MobileNetV2 uses:

- **Depthwise Separable Convolutions**: Reduce computational cost
- **Inverted Residuals**: Efficient feature extraction
- **Linear Bottlenecks**: Preserve representational power
- **ReLU6 Activation**: Well suited to mobile hardware
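
These pieces fit together in the inverted residual block. The sketch below is a simplified PyTorch illustration of one such block (expand with a 1Γ—1 conv, filter with a depthwise 3Γ—3 conv, project back through a linear bottleneck), not the exact implementation shipped in the library:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Simplified MobileNetV2-style inverted residual block (illustrative)."""

    def __init__(self, in_ch, out_ch, stride=1, expand=6):
        super().__init__()
        hidden = in_ch * expand
        # The skip connection only applies when the shape is unchanged
        self.use_res = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # 1x1 expansion to a wider representation
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise conv: one filter per channel, so it is cheap
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear bottleneck projection: note, no activation here
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out

x = torch.randn(1, 32, 56, 56)
y = InvertedResidual(32, 32)(x)
print(y.shape)  # torch.Size([1, 32, 56, 56])
```

Keeping the bottleneck projection linear (no ReLU6 on the last 1Γ—1 conv) is what the paper means by "linear bottlenecks": a nonlinearity in the narrow representation would destroy information.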

## 🌟 Key Advantages

### vs. Heavy Models (ResNet, EfficientNet)

- βœ… Faster inference (optimized for mobile)
- βœ… Smaller memory footprint
- βœ… Better battery efficiency
- βœ… Ready for edge deployment

### vs. Other Demos

- βœ… Whisper-style interface (familiar UX)
- βœ… Auto-classification (no manual buttons needed)
- βœ… Clean, modern design
- βœ… Mobile-responsive

## πŸ“± Mobile Deployment

This model is specifically designed for mobile deployment:

```python
import torch
from transformers import MobileNetV2ForImageClassification

# Example mobile-friendly loading options
model = MobileNetV2ForImageClassification.from_pretrained(
    "google/mobilenet_v2_1.0_224",
    torch_dtype=torch.float16,  # half precision halves the weight memory
    low_cpu_mem_usage=True,     # reduces peak RAM while loading
)
```

## 🎨 Customization

### Adding New Examples

```python
example_urls = {
    "Your Category": "https://your-image-url.com/image.jpg",
    # Add more examples here
}
```

### Adjusting the UI Theme

```python
theme = gr.themes.Soft()  # Options: Soft, Default, Monochrome
```

### Changing the Model

```python
MODEL_NAME = "google/mobilenet_v2_1.4_224"   # larger variant
MODEL_NAME = "google/mobilenet_v2_0.75_224"  # smaller variant
```

## πŸ”„ Image Processing Pipeline

1. **Input**: User uploads an image (any format/size)
2. **Preprocessing**: Resize to 224Γ—224 and normalize
3. **Inference**: MobileNetV2 forward pass
4. **Postprocessing**: Apply softmax and take the top-5
5. **Output**: Formatted predictions with confidence scores
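
The postprocessing and output steps can be sketched as below, using random stand-in logits so the snippet runs without downloading the model; a real app would use the model's `id2label` mapping instead of the placeholder labels:

```python
import numpy as np

def top5(logits, labels):
    # softmax turns raw logits into probabilities
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs = e / e.sum()
    # indices of the five highest probabilities, descending
    idx = np.argsort(probs)[::-1][:5]
    return [(labels[i], float(probs[i]) * 100) for i in idx]

labels = [f"class_{i}" for i in range(1000)]  # placeholder for ImageNet labels
logits = np.random.randn(1000)               # stand-in for a model forward pass
for name, pct in top5(logits, labels):
    print(f"{name}: {pct:.1f}%")
```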

## πŸš€ Performance

- **Inference Speed**: ~50 ms on CPU, ~10 ms on GPU
- **Memory Usage**: ~200 MB RAM
- **Model Size**: ~14 MB
- **Throughput**: 20+ images/second on modern hardware
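
These numbers vary by hardware, so it is worth measuring locally. A minimal timing harness like the one below can be pointed at the real model call; the `fake_model` stand-in is only there to keep the sketch self-contained:

```python
import time
import numpy as np

def benchmark(fn, x, warmup=3, runs=20):
    # average wall-clock latency per call, in milliseconds
    for _ in range(warmup):  # warm up caches / lazy initialization
        fn(x)
    start = time.perf_counter()
    for _ in range(runs):
        fn(x)
    return (time.perf_counter() - start) / runs * 1000

# stand-in for model inference so the sketch runs without downloading weights
fake_model = lambda x: x.mean()

x = np.random.randn(1, 3, 224, 224).astype(np.float32)
latency_ms = benchmark(fake_model, x)
print(f"~{latency_ms:.3f} ms/image (~{1000 / latency_ms:.0f} images/sec)")
```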

## πŸ“š Example Classes

The model can classify 1,000 ImageNet categories, including:

- **Animals**: Dogs, cats, birds, wildlife
- **Vehicles**: Cars, trucks, motorcycles, aircraft
- **Objects**: Furniture, electronics, tools
- **Food**: Fruits, vegetables, dishes
- **Nature**: Plants, landscapes, weather

## 🀝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your improvements
4. Submit a pull request

## πŸ“„ License

This project is open source and available under the MIT License.

πŸ™ Acknowledgments

  • Google Research: For MobileNetV2 architecture
  • Hugging Face: For the Transformers library and model hosting
  • Gradio Team: For the amazing web interface framework
  • ImageNet: For the comprehensive dataset

*Built with ❀️ using Hugging Face Transformers and Gradio*