Spaces:
Runtime error
Runtime error
| title: Fone | |
| emoji: π | |
| colorFrom: red | |
| colorTo: blue | |
| sdk: gradio | |
| sdk_version: 6.18.0 | |
| python_version: '3.12' | |
| app_file: app.py | |
| pinned: false | |
| license: agpl-3.0 | |
| short_description: automated Voice-to-Architecture pipeline that transforms raw | |
| tags: | |
| - track:backyard | |
| - track:wood | |
| # fone // Voice Architecture Pipeline | |
| **fone** is a decentralized Voice-to-Architecture pipeline built for the **Build Small Hackathon**. It transforms fragmented, erratic spoken voice notes from developers directly into structured system design specifications, task allocation matrices, and isolated code/schema payloads. | |
| By replacing open-ended conversation with a rigid, deterministic parsing workflow, it acts as a local-first engineering console optimized for technical execution. | |
| --- | |
| ## π οΈ The Under-32B Model Stack | |
| This project achieves a lightweight footprint by chaining two specialized open-weights models from Cohere Labs. Each model remains far below the hackathon's parameter ceiling, delivering fast local-first or serverless inference. | |
| | Layer | Model Identifier | Parameter Count | Core Responsibility | | |
| | --- | --- | --- | --- | | |
| | **Audio Processing** | `CohereLabs/cohere-transcribe-03-2026` | **2 Billion** | Blazing-fast speech-to-text processing with manual linguistic targets and strict punctuation boundaries. | | |
| | **Linguistic Generation** | `CohereLabs/tiny-aya-earth` | **3.35 Billion** | Regional variant optimized for West Asian and African language structures. Ingests raw text to map out structural layouts. | | |
| * **Total Combined Parameter Footprint:** 5.35B parameters. | |
| --- | |
| ## ποΈ Key Features | |
| * **Voice-to-Spec Routing:** Bypasses conversational filler to map audio context directly into structured Markdown documents and executable code blocks using zero-temperature parameters. | |
| * **Orchestration Hub Layout:** Built completely outside standard chat box wrappers using Gradio 6 `gr.Blocks`. It categorizes multi-layered data arrays into three distinct runtime views: | |
| * **System Summary:** A comprehensive evaluation of the spoken feature set or database requirement. | |
| * **Task Allocation Matrix:** A structured `gr.Dataframe` tracking specific objectives, priorities, and implementation contexts. | |
| * **Code & Schema Artifacts:** A clean code interface showcasing extracted system configurations or setup scripts. | |
| * **Sovereign Dark Aesthetic:** Completely customized using a clean, high-contrast monochrome design wrapper to match developer console guidelines. | |
| --- | |
| ## π Technical Configuration | |
| The application runtime environment leverages an isolated serverless API workflow to map incoming requests to Hugging Face infrastructure without local hardware dependencies. | |
| ### Dependencies (`requirements.txt`) | |
| ```text | |
| gradio | |
| huggingface_hub | |
| ``` | |
| ### Core Pipeline Execution Loop | |
| 1. **Audio Capture:** Ingests raw `.wav` or `.mp3` payloads directly through the web interface. | |
| 2. **Context Alignment:** Passes the audio segment to the 2B transcription node. | |
| 3. **Structured Structuring:** Feeds the resulting text transcript into `tiny-aya-earth`, enforcing a rigid JSON output format through systemic system prompting. | |
| 4. **Deterministic Split:** Parses the JSON block through Python regex validations to isolate documentation fields from code files, writing them to disk for direct download. | |
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |