---
title: CurvOpt SmarterModels
emoji: 📊
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Smarter Models, Smaller Footprint
---

# CurvOpt-LLM: Realtime Optimizer

**Curvature-guided mixed-precision optimization for LLMs. No retraining required.**

## What This Does

- Loads any HuggingFace causal LM
- Computes diagonal Fisher curvature per layer from real gradients
- Assigns FP32 / FP16 / BF16 to each layer based on its sensitivity
- Rewrites and saves a deployable optimized model (downloadable ZIP)
- Reports electricity, CO₂, and water footprint savings
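
The curvature and precision-assignment steps above can be sketched roughly as follows. This is a minimal illustration only, not the Space's actual code: plain Python lists stand in for flattened gradient tensors, and the function names, the thresholds `hi_frac` and `lo_frac`, and the choice of which tier maps to BF16 versus FP16 are all assumptions made for the example.

```python
from statistics import fmean


def fisher_diagonal_score(per_sample_grads):
    """Mean squared gradient over all samples and parameters of one layer.

    per_sample_grads holds one gradient vector (list of floats) per
    calibration sample; it stands in for flattened gradient tensors.
    """
    return fmean(g * g for grads in per_sample_grads for g in grads)


def assign_precision(scores, hi_frac=0.5, lo_frac=0.05):
    """Map each layer to a dtype label from its score relative to the max.

    hi_frac and lo_frac are hypothetical thresholds, not CurvOpt values.
    """
    top = max(scores.values())
    plan = {}
    for name, score in scores.items():
        ratio = score / top if top > 0 else 0.0
        if ratio >= hi_frac:
            plan[name] = "fp32"  # most sensitive tier: keep full precision
        elif ratio >= lo_frac:
            plan[name] = "bf16"  # middle tier (BF16 here is illustrative)
        else:
            plan[name] = "fp16"  # least sensitive tier
    return plan


# Toy gradients: two calibration samples, two parameters per layer.
grads = {
    "layer.0": [[0.9, -1.1], [1.0, 1.0]],
    "layer.1": [[0.3, -0.3], [0.3, 0.3]],
    "layer.2": [[0.01, 0.02], [-0.01, 0.0]],
}
scores = {name: fisher_diagonal_score(g) for name, g in grads.items()}
plan = assign_precision(scores)
```

With these toy inputs, `layer.0` has the largest score and stays in the full-precision tier, while the two layers with smaller gradients fall into the reduced-precision tiers.
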
## How to Use

1. Select a model from the dropdown (or enter a custom HF model ID)
2. Set the number of calibration samples (1–32) and the perplexity (PPL) tolerance
3. Click **Run Optimization**
4. Download the optimized model ZIP when the run finishes
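
To make the PPL tolerance in step 2 concrete, here is a small sketch of how such a check might work. It assumes the tolerance is a maximum relative increase in perplexity, expressed in percent; the Space may define it differently, and `perplexity` and `within_tolerance` are hypothetical helper names.

```python
import math


def perplexity(token_nlls):
    """Perplexity is exp of the mean negative log-likelihood per token."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def within_tolerance(baseline_ppl, optimized_ppl, tolerance_pct):
    """True if the relative PPL increase is at most tolerance_pct percent.

    Assumption: tolerance is relative and given in percent.
    """
    return optimized_ppl <= baseline_ppl * (1.0 + tolerance_pct / 100.0)


# Toy per-token losses (natural-log NLL) before and after optimization.
baseline = perplexity([2.0, 2.2, 1.8])      # mean NLL of 2.0
optimized = perplexity([2.0, 2.25, 1.81])   # mean NLL of 2.02

accept_loose = within_tolerance(baseline, optimized, 5.0)
accept_strict = within_tolerance(baseline, optimized, 1.0)
```

Here the optimized model's perplexity is about 2% above the baseline, so it passes a 5% tolerance but fails a 1% tolerance.
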
## Supported Models

OPT family · GPT-2 family · Pythia · Phi · BLOOM · Mistral · Llama-2 · Qwen · Falcon · and any `AutoModelForCausalLM`-compatible model.
## Research

Based on Fisher Information / Optimal Brain Damage curvature analysis.
Novel contribution: per-request curvature-gated mixed precision with user intent feedback.
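
For reference, the standard quantities behind this analysis can be written as below. The first two lines are the usual empirical diagonal Fisher estimate and the Optimal Brain Damage saliency; the per-layer score on the last line is an assumed aggregation for illustration, since the exact score used by this Space is not specified here.

```latex
% Empirical diagonal Fisher over N calibration samples,
% where \ell_n is the negative log-likelihood of sample n:
F_{ii} \approx \frac{1}{N} \sum_{n=1}^{N}
  \left( \frac{\partial \ell_n}{\partial \theta_i} \right)^{2}

% Optimal Brain Damage saliency of parameter \theta_i,
% with the Hessian diagonal h_{ii} approximated by F_{ii}:
s_i = \tfrac{1}{2}\, h_{ii}\, \theta_i^{2}
    \approx \tfrac{1}{2}\, F_{ii}\, \theta_i^{2}

% Assumed per-layer sensitivity: mean diagonal Fisher over the
% parameter index set \Theta_l of layer l:
S_l = \frac{1}{|\Theta_l|} \sum_{i \in \Theta_l} F_{ii}
```
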
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |