---
title: Person Classification Demo
emoji: 📱
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
# 📱 MobileNetV2 Image Classification Demo

A lightweight, interactive image classification demo built with Hugging Face Transformers and Gradio. It adapts the clean interface style of the popular Whisper demos for computer vision classification tasks.
## ✨ Features
- Mobile-Optimized Model: Uses MobileNetV2, designed specifically for efficient mobile and edge deployment
- Interactive Web Interface: Clean, modern Gradio interface similar to Whisper demos
- Real-time Classification: Instant image classification with top-5 predictions
- 1000 ImageNet Classes: Recognizes a wide variety of objects, animals, vehicles, and scenes
- Confidence Scores: Shows prediction confidence as percentages
- Example Images: Pre-loaded example images for quick testing
- Responsive Design: Works seamlessly on desktop and mobile devices
- Lightweight: Only 3.4M parameters for fast inference
## 🚀 Quick Start

### Option 1: Hugging Face Spaces (Recommended)

Deploy instantly to Hugging Face Spaces:
- Create a new Space on Hugging Face Spaces
- Upload these files: `app.py`, `requirements.txt`, `README.md`
- Your demo will be live automatically!
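For reference, a minimal `requirements.txt` consistent with the dependencies listed under Technical Details (the `gradio` pin matches the `sdk_version` above; the other entries are left unpinned since the original project's versions are not known):

```text
# requirements.txt — illustrative; pin versions as needed
torch
transformers
gradio==4.44.0
Pillow
numpy
```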
### Option 2: Local Development

```bash
# Clone this repository
git clone <your-repo-url>
cd mobilenetv2-classification-demo

# Install dependencies
pip install -r requirements.txt

# Run the demo
python app.py
```
The demo will be available at http://localhost:7860
## 🎯 How to Use
- Upload Image: Click on the upload area or drag and drop an image
- Get Results: Classification happens automatically, showing top-5 predictions
- Try Examples: Use the example buttons to test with sample images
- View Confidence: Each prediction shows a confidence percentage
## 📊 Model Information

- Model: `google/mobilenet_v2_1.0_224`
- Architecture: MobileNetV2 with 1.0 width multiplier
- Input Size: 224×224 pixels
- Parameters: 3.4 million (lightweight!)
- Classes: 1,000 ImageNet categories
- Optimization: Designed for mobile and edge devices
## 🔧 Technical Details

### Dependencies
- PyTorch: Deep learning framework
- Transformers: Hugging Face model library
- Gradio: Web interface framework
- Pillow: Image processing
- NumPy: Numerical computing
### Model Architecture
MobileNetV2 uses:
- Depthwise Separable Convolutions: Reduces computational cost
- Inverted Residuals: Efficient feature extraction
- Linear Bottlenecks: Maintains representational power
- ReLU6 Activation: Optimized for mobile hardware
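The savings from depthwise separable convolutions can be sketched with a quick multiply-accumulate count. This is a back-of-the-envelope comparison; the layer shapes below are illustrative, not taken from the actual MobileNetV2 configuration:

```python
# Rough multiply-accumulate (MAC) counts for one conv layer.
# Illustrative shapes: 3x3 kernel, 32 input channels, 64 output
# channels, 112x112 output feature map.
k, m, n, f = 3, 32, 64, 112

# Standard convolution: every output channel mixes all input channels.
standard = k * k * m * n * f * f

# Depthwise separable: a per-channel 3x3 depthwise conv,
# then a 1x1 pointwise conv to mix channels.
depthwise = k * k * m * f * f
pointwise = m * n * f * f
separable = depthwise + pointwise

# The cost ratio simplifies to 1/n + 1/k^2 (~8x cheaper here).
ratio = separable / standard
print(f"standard:  {standard:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"ratio:     {ratio:.3f}")
```

With these shapes the separable form needs roughly an eighth of the multiply-accumulates, which is the core reason MobileNetV2 fits on mobile and edge hardware.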
## 📈 Key Advantages

### vs. Heavy Models (ResNet, EfficientNet)
- ✅ Faster inference (optimized for mobile)
- ✅ Smaller memory footprint
- ✅ Better battery efficiency
- ✅ Edge deployment ready

### vs. Other Demos
- ✅ Whisper-style interface (familiar UX)
- ✅ Auto-classification (no manual buttons needed)
- ✅ Clean, modern design
- ✅ Mobile-responsive
## 📱 Mobile Deployment

This model is specifically designed for mobile deployment:

```python
import torch
from transformers import MobileNetV2ForImageClassification

# Example mobile optimization
model = MobileNetV2ForImageClassification.from_pretrained(
    "google/mobilenet_v2_1.0_224",
    torch_dtype=torch.float16,  # Half precision
    low_cpu_mem_usage=True,     # Memory optimization
)
```
## 🎨 Customization

### Adding New Examples

```python
example_urls = {
    "Your Category": "https://your-image-url.com/image.jpg",
    # Add more examples here
}
```

### Adjusting UI Theme

```python
theme = gr.themes.Soft()  # Options: Soft, Default, Monochrome
```

### Changing Model

```python
MODEL_NAME = "google/mobilenet_v2_1.4_224"   # Larger variant
MODEL_NAME = "google/mobilenet_v2_0.75_224"  # Smaller variant
```
## 🔄 Image Processing Pipeline

- Input: User uploads image (any format/size)
- Preprocessing: Resize to 224×224, normalize
- Inference: MobileNetV2 forward pass
- Postprocessing: Apply softmax, get top-5
- Output: Formatted predictions with confidence
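The postprocessing step above can be sketched in plain Python. This is a minimal softmax-plus-top-k helper, assuming the model returns raw logits; the function name `top_k_predictions` and the toy labels are illustrative, not taken from the actual `app.py`:

```python
import math

def top_k_predictions(logits, labels, k=5):
    """Apply softmax to raw logits and return the top-k
    (label, confidence-percentage) pairs, highest first."""
    # Numerically stable softmax: subtract the max logit first.
    mx = max(logits)
    exps = [math.exp(x - mx) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda p: p[1], reverse=True)
    return [(label, round(100 * p, 2)) for label, p in ranked[:k]]

# Toy example with 6 "classes" instead of ImageNet's 1,000.
labels = ["cat", "dog", "car", "plane", "apple", "tree"]
logits = [2.0, 5.0, 0.5, 1.0, -1.0, 0.0]
print(top_k_predictions(logits, labels, k=5))
```

In the real app the logits come from the MobileNetV2 forward pass and the labels from the model's ImageNet `id2label` mapping.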
## 📊 Performance
- Inference Speed: ~50ms on CPU, ~10ms on GPU
- Memory Usage: ~200MB RAM
- Model Size: ~14MB
- Throughput: 20+ images/second on modern hardware
## 📋 Example Classes
The model can classify 1,000 ImageNet categories including:
- Animals: Dogs, cats, birds, wildlife
- Vehicles: Cars, trucks, motorcycles, aircraft
- Objects: Furniture, electronics, tools
- Food: Fruits, vegetables, dishes
- Nature: Plants, landscapes, weather
## 🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your improvements
- Submit a pull request
## 📄 License
This project is open source and available under the MIT License.
## 🙏 Acknowledgments
- Google Research: For MobileNetV2 architecture
- Hugging Face: For the Transformers library and model hosting
- Gradio Team: For the amazing web interface framework
- ImageNet: For the comprehensive dataset
## 🔗 Links

Built with ❤️ using Hugging Face Transformers and Gradio