hackathon-advisor/data/quest_labels/in/code-shrink-token-decimator.json
feat: add live project atlas
4791c0a verified
{
"id": "build-small-hackathon/code-shrink-token-decimator",
"slug": "code-shrink-token-decimator",
"title": "Code Shrink Token Decimator",
"sdk": "gradio",
"declared_models": [],
"tags": [
"gradio",
"region:us"
],
"app_file": "app.py",
"README": "# ⚡ Code-Shrink: Token-Decimator v1.0 > **An Ultra-Lightweight Computational Utility Built to Eliminate LLM Context Bloat Natively on the Edge Container.** > *Submitted for the Hugging Face Build Small Hackathon (Track 2: Performance & Efficiency Optimization).* --- ## 📽️ Project Demonstration & Walkthrough Check out the full workflow, speed metrics, and feature breakdown in action here: 🔗 **[Watch the Live Demo on TikTok](https://www.tiktok.com/@salarai123/video/7648566501598940436)** --- ## 🔍 The Problem & The Solution ### The Bottleneck: LLM Context Inflation Modern production applications relying on Large Language Model (LLM) APIs suffer from massive financial overhead. Upstream providers charge by the token—meaning heavy indentation loops, generic code comments, raw text formatting, and large structural blocks exponentially inflate infrastructure bills. ### The Engine: Code-Shrink **Code-Shrink v1.0** passes raw context inputs through an edge-computed Abstract Syntax Tree (AST) framework. Instead of hosting gigabytes of neural network weights that lag and crash free hosting tiers, this application runs entirely on zero-cost, lightweight lexical optimization models. It reduces prompt token sizes by **up to 66% in under 10 milliseconds**. --- ## ⚡ Technical Core Features * **Abstract Syntax Tree (AST) De-bloating:** Fully parses Python/R/SQL environments natively to structurally strip docstrings, developer comments, and empty lines while maintaining 100% semantic code inte ...",
"APP_FILE": "import sys\nimport types\nimport ast\nimport re\nimport json\n\n# 🚨 DYNAMIC FIX: Python 3.13 Compatibility Core Patches\nif 'audioop' not in sys.modules:\n dummy_audioop = types.ModuleType('audioop')\n dummy_audioop.error = Exception\n sys.modules['audioop'] = dummy_audioop\n\nif 'pyaudioop' not in sys.modules:\n dummy_pyaudioop = types.ModuleType('pyaudioop')\n dummy_pyaudioop.error = Exception\n sys.modules['pyaudioop'] = dummy_pyaudioop\n\ntry:\n import huggingface_hub\nexcept ImportError:\n huggingface_hub = types.ModuleType('huggingface_hub')\n sys.modules['huggingface_hub'] = huggingface_hub\n\nif not hasattr(huggingface_hub, 'HfFolder'):\n class DummyHfFolder:\n @staticmethod\n def get_token(): return None\n @staticmethod\n def save_token(token): pass\n @staticmethod\n def delete_token(): pass\n huggingface_hub.HfFolder = DummyHfFolder\n\nimport gradio as gr\n\ndef estimate_tokens(text):\n \"\"\"Ultra-fast local token estimator (Roughly 1 token = 4 chars for code/text setup)\"\"\"\n if not text:\n return 0\n return max(1, len(text) // 4 + text.count(' ') // 2)\n\ndef shrink_python_code(source_code):\n \"\"\"Parses and strips syntax trees to remove bloat tokens natively\"\"\"\n try:\n tree = ast.parse(source_code)\n for node in ast.walk(tree):\n if isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.Module)):\n if node.body and isinstance(node.body[0], ast.Expr) and isinstance(node.body[0].value, ast.Constant) and isinstance(node.body[0].value.value, str):\n node.body.pop(0)\n \n clean_code = ast.unparse(tree)\n clean_code = re.sub(r'\\n\\s*\\n', '\\n', clean_code)\n return clean_code.strip()\n except Exception:\n code = re.sub(r'#.*', '', source_code)\n code = re.sub(r'\\\"\\\"\\\"[\\s\\S]*?\\\"\\\"\\\"', '', code)\n code = re.s ..."
}