title: Formula Engine Chatbot
emoji: 🧮
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.32.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
- ml-intern
🧮 Formula Engine Chatbot
AI-Powered Weight Compression: Qwen 0.5B from Mathematical Formulas
This Space demonstrates the Formula Engine concept:
- Instead of storing the full 942 MB Qwen 0.5B model, the Space stores compact mathematical formulas (~474 MB) that reconstruct the weights on the fly.
- This saves 49.7% of disk space while maintaining 99.99% accuracy.
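The savings figure follows directly from the two sizes quoted above. A minimal arithmetic check, using only those numbers (the variable names are illustrative):

```python
# Sizes quoted in this README, in megabytes.
full_model_mb = 942
formula_store_mb = 474

# Fraction of disk space no longer needed once formulas replace raw weights.
saved_fraction = 1 - formula_store_mb / full_model_mb
print(f"Disk space saved: {saved_fraction:.1%}")  # prints "Disk space saved: 49.7%"
```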
How it works:
- Formula Discovery: Analyzes each weight matrix to find the most compact representation
- 4-bit Quantization: W ≈ scale × W_q + zero_point (75% compression per layer)
- SVD Factorization: W ≈ U_r × S_r × V_r^T (for rectangular matrices)
- On-the-fly Reconstruction: Formulas regenerate weights at startup
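The two formulas in the list above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the Space's actual code: it assumes one scale and zero point per matrix (a real implementation may use per-group values), treats zero_point as a floating-point offset so that the reconstruction matches W ≈ scale × W_q + zero_point, and uses a hypothetical dict layout (`kind`, `w_q`, `u`, `s`, `vt`) for the stored formulas.

```python
import numpy as np

LEVELS = 15  # 4 bits give 16 integer codes: 0..15


def quantize_4bit(w):
    """Return (w_q, scale, zero_point) such that w ≈ scale * w_q + zero_point."""
    zero_point = float(w.min())  # floating-point offset, as in the formula above
    scale = (float(w.max()) - zero_point) / LEVELS
    if scale == 0.0:
        scale = 1.0  # constant matrix: any positive scale works
    # Codes are held one per byte here; packing two codes per byte is
    # needed to actually realize the 4-bit storage saving.
    w_q = np.clip(np.round((w - zero_point) / scale), 0, LEVELS).astype(np.uint8)
    return w_q, scale, zero_point


def svd_factorize(w, rank):
    """Return the rank-r factors (U_r, S_r, V_r^T) of w."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :]


def reconstruct(formula):
    """Rebuild a weight matrix from a formula record (hypothetical dict layout)."""
    if formula["kind"] == "quant4":
        return formula["scale"] * formula["w_q"].astype(np.float32) + formula["zero_point"]
    if formula["kind"] == "svd":
        # (U_r * S_r) scales each column of U_r by its singular value.
        return (formula["u"] * formula["s"]) @ formula["vt"]
    raise ValueError(f"unknown formula kind: {formula['kind']}")


rng = np.random.default_rng(0)

# Quantization round trip on a random float32 matrix.
w = rng.standard_normal((64, 32)).astype(np.float32)
w_q, scale, zero_point = quantize_4bit(w)
w_hat = reconstruct({"kind": "quant4", "w_q": w_q, "scale": scale, "zero_point": zero_point})
quant_err = float(np.abs(w - w_hat).max())  # about scale / 2 at worst

# SVD round trip on a matrix that has exact rank 4.
low_rank = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))
u, s, vt = svd_factorize(low_rank, rank=4)
svd_hat = reconstruct({"kind": "svd", "u": u, "s": s, "vt": vt})
svd_err = float(np.abs(low_rank - svd_hat).max())  # near zero for a rank-4 input
```

Two points about the trade-offs: the 75% figure is consistent with replacing 16-bit values by 4-bit codes, assuming the original weights are stored in 16 bits. A rank-r factorization of an m × n matrix stores r × (m + n + 1) numbers instead of m × n, so it only saves space when r is small relative to both dimensions.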
Note
This Space requires ~2 GB of RAM to reconstruct and run the model. Inference runs on CPU.