---
title: Neural Network Quantizer
emoji: ⚡
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# Neural Network Weight Quantizer
Quantize neural network weights to lower precision formats (INT8, INT4, NF4) with interactive visualizations.
## Features
- 🔢 Multi-bit quantization (4-bit, 8-bit)
- 📊 Interactive weight visualizations
- 🤗 HuggingFace model support (optional)
- ⚡ GPU acceleration (when available)
- 📈 Quantization error analysis
- 🔄 Method comparison (INT8 vs INT4 vs NF4)
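For reference, the core of INT8 weight quantization can be sketched in a few lines of NumPy. This is a minimal symmetric, per-tensor scheme; the scheme the backend actually uses (per-channel scales, zero points, NF4 lookup tables) may differ:

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor INT8: map [-max|w|, max|w|] onto [-127, 127]."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the quantized values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)
q, scale = quantize_int8(w)
# Mean absolute quantization error: bounded by half a quantization step.
err = float(np.abs(w - dequantize(q, scale)).mean())
```

The same round-trip with a 4-bit range (`[-7, 7]`) gives roughly 16x coarser steps, which is what the error-analysis tab visualizes when comparing methods.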
## Quick Start
- Use the Quantizer tab to test on random weights
- Compare different methods in the Analysis tab
- Optionally load a HuggingFace model in the Models tab
## API
The backend exposes a REST API under `/api`:
- `GET /api/system/info` - System capabilities
- `POST /api/quantize/weights` - Quantize custom weights
- `POST /api/models/load` - Load a HuggingFace model
- `POST /api/analysis/compare` - Compare quantization methods
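As an example, a quantization request could be built like this with the standard library. The endpoint path comes from the list above, but the payload field names (`weights`, `method`) are assumptions, not a documented schema; check the backend code for the actual contract:

```python
import json
import urllib.request

API = "http://localhost:7860/api"

# Hypothetical payload shape: field names are assumptions.
payload = {"weights": [0.12, -0.5, 0.33, 0.9], "method": "int8"}

req = urllib.request.Request(
    f"{API}/quantize/weights",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the server running locally:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```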
## 🚀 Deployment
### Hugging Face Spaces
This project is configured for Hugging Face Spaces using the Docker SDK.
1. Create a new Space on Hugging Face.
2. Select Docker as the SDK.
3. Push this repository to your Space:

   ```bash
   git remote add space https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   git push space main
   ```
### Docker
Run locally with Docker:
```bash
docker build -t quantizer .
docker run -p 7860:7860 quantizer
```
Then open http://localhost:7860 in your browser.