---
title: CurvOpt SmarterModels
emoji: 📊
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Smarter Models, Smaller Footprint
---
# CurvOpt-LLM — Realtime Optimizer

**Curvature-guided mixed-precision optimization for LLMs. No retraining required.**

## What This Does
- Loads any Hugging Face causal LM
- Computes the diagonal Fisher curvature of each layer from real gradients on the calibration samples
- Assigns FP32 / FP16 / BF16 per layer based on sensitivity
- Rewrites and saves a deployable optimized model (downloadable ZIP)
- Reports electricity, CO₂, and water footprint savings
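
The curvature and precision-assignment steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the Space's actual code: the function names, the toy gradient values, the ranking rule, and the `fp32_fraction` / `bf16_fraction` parameters are all assumptions made for the example. It scores each layer by the mean squared gradient over calibration samples (an empirical diagonal Fisher estimate) and gives the most sensitive layers the widest format.

```python
from statistics import fmean


def fisher_diagonal_score(per_sample_grads):
    """Mean squared gradient entry over all calibration samples for one layer."""
    squares = [g * g for sample in per_sample_grads for g in sample]
    return fmean(squares)


def assign_precision(scores, fp32_fraction=0.25, bf16_fraction=0.25):
    """Rank layers by sensitivity: top share -> fp32, next share -> bf16, rest -> fp16.

    The fractions are hypothetical knobs, not settings taken from the Space.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_fp32 = max(1, round(len(ranked) * fp32_fraction))
    n_bf16 = round(len(ranked) * bf16_fraction)
    plan = {}
    for rank, name in enumerate(ranked):
        if rank < n_fp32:
            plan[name] = "fp32"
        elif rank < n_fp32 + n_bf16:
            plan[name] = "bf16"
        else:
            plan[name] = "fp16"
    return plan


# Toy gradients: two calibration samples, two parameters per layer (made-up values).
toy_grads = {
    "layer.0": [[0.9, -1.1], [1.0, 0.8]],
    "layer.1": [[0.1, 0.2], [-0.1, 0.05]],
    "layer.2": [[0.01, -0.02], [0.0, 0.01]],
    "layer.3": [[0.4, -0.3], [0.5, 0.2]],
}
scores = {name: fisher_diagonal_score(g) for name, g in toy_grads.items()}
plan = assign_precision(scores)
print(plan)
```

In a real run the per-sample gradients would come from backpropagating the language-modeling loss through the loaded model; the lists here only stand in for them.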

## How to Use
1. Select a model from the dropdown (or enter a custom HF model ID)
2. Set calibration samples (1–32) and PPL tolerance
3. Click **Run Optimization**
4. Download the optimized model ZIP when done
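
As a rough illustration of what a PPL tolerance in step 2 could mean, the sketch below computes perplexity from per-token negative log-likelihoods and checks whether the optimized model stays within a relative budget. Treating the tolerance as a percentage increase over the baseline is an assumption, as are the function names; the Space may define it differently.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def within_tolerance(baseline_ppl, optimized_ppl, tolerance_pct):
    """True if the relative perplexity increase stays at or below tolerance_pct."""
    increase_pct = 100.0 * (optimized_ppl - baseline_ppl) / baseline_ppl
    return increase_pct <= tolerance_pct


# A model that assigns probability 1/4 to every token has perplexity 4.
uniform_ppl = perplexity([math.log(4)] * 3)

ok = within_tolerance(baseline_ppl=20.0, optimized_ppl=20.5, tolerance_pct=5.0)
too_lossy = within_tolerance(baseline_ppl=20.0, optimized_ppl=22.0, tolerance_pct=5.0)
```

Under this reading, a tighter tolerance would leave more layers in FP32, trading footprint savings for fidelity.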

## Supported Models
OPT family · GPT-2 family · Pythia · Phi · BLOOM · Mistral · Llama-2 · Qwen · Falcon · and any other `AutoModelForCausalLM`-compatible model.

## Research
Based on Fisher Information / Optimal Brain Damage curvature analysis.
Novel contribution: per-request curvature-gated mixed precision with user intent feedback.
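
For reference, the standard quantities behind this analysis can be written as below. The notation is the textbook form, with $N$ calibration samples $(x_n, y_n)$ and parameters $\theta$; how the Space aggregates these per-parameter values into a per-layer score is not stated here, so none is shown.

```latex
% Empirical diagonal Fisher information for parameter \theta_i
F_{ii} \approx \frac{1}{N} \sum_{n=1}^{N}
  \left( \frac{\partial \log p(y_n \mid x_n; \theta)}{\partial \theta_i} \right)^{2}

% Optimal Brain Damage saliency, with the diagonal Hessian h_{ii} approximated by F_{ii}
s_i = \tfrac{1}{2}\, h_{ii}\, \theta_i^{2} \approx \tfrac{1}{2}\, F_{ii}\, \theta_i^{2}
```

A large $s_i$ means that perturbing $\theta_i$, for example by rounding it to a lower-precision format, is expected to raise the loss more.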

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference