---
title: Fone
emoji: 🐠
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 6.18.0
python_version: '3.12'
app_file: app.py
pinned: false
license: agpl-3.0
short_description: automated Voice-to-Architecture pipeline that transforms raw
tags:
  - track:backyard
  - track:wood
---

# fone // Voice Architecture Pipeline

**fone** is a decentralized Voice-to-Architecture pipeline built for the **Build Small Hackathon**. It turns fragmented, erratic developer voice notes directly into structured system design specifications, task allocation matrices, and isolated code/schema payloads. By replacing open-ended conversation with a rigid, deterministic parsing workflow, it acts as a local-first engineering console optimized for technical execution.

---

## 🛠️ The Under-32B Model Stack

This project keeps a lightweight footprint by chaining two specialized open-weights models from Cohere Labs. Each model stays far below the hackathon's 32B-parameter ceiling, which allows fast local-first or serverless inference.

| Layer | Model Identifier | Parameter Count | Core Responsibility |
| --- | --- | --- | --- |
| **Audio Processing** | `CohereLabs/cohere-transcribe-03-2026` | **2 Billion** | Fast speech-to-text processing with manually specified target languages and strict punctuation boundaries. |
| **Linguistic Generation** | `CohereLabs/tiny-aya-earth` | **3.35 Billion** | Regional variant optimized for West Asian and African language structures. Ingests the raw transcript and maps it to a structured layout. |

* **Total Combined Parameter Footprint:** 5.35B parameters.

---

## 🎛️ Key Features

* **Voice-to-Spec Routing:** Bypasses conversational filler and maps audio context directly into structured Markdown documents and executable code blocks, with generation temperature set to zero.
* **Orchestration Hub Layout:** Built with Gradio 6 `gr.Blocks` rather than a standard chat-box wrapper.
  It organizes the parsed output into three distinct runtime views:
  * **System Summary:** A comprehensive evaluation of the spoken feature set or database requirement.
  * **Task Allocation Matrix:** A structured `gr.Dataframe` tracking specific objectives, priorities, and implementation contexts.
  * **Code & Schema Artifacts:** A clean code interface showing extracted system configurations or setup scripts.
* **Sovereign Dark Aesthetic:** A fully custom, high-contrast monochrome theme styled after a developer console.

---

## 🚀 Technical Configuration

The application routes incoming requests to Hugging Face infrastructure through an isolated serverless API workflow, so it has no local hardware dependencies.

### Dependencies (`requirements.txt`)

```text
gradio
huggingface_hub
```

### Core Pipeline Execution Loop

1. **Audio Capture:** Ingests raw `.wav` or `.mp3` payloads directly through the web interface.
2. **Context Alignment:** Passes the audio segment to the 2B transcription model.
3. **Structured Generation:** Feeds the resulting transcript into `tiny-aya-earth`, enforcing a rigid JSON output format through the system prompt.
4. **Deterministic Split:** Parses the JSON block with Python regex validation to separate documentation fields from code files, then writes them to disk for direct download.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
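
### Example: Deterministic Split (sketch)

Step 4 of the pipeline loop above can be illustrated with a short, standard-library-only sketch. This is not the code in `app.py`: the key names `summary`, `tasks`, and `artifacts`, the helper names, and the sample model output are all hypothetical, since the JSON schema enforced by the system prompt is not documented here. It shows one way to pull a JSON object out of noisy model output with a regex, separate documentation fields from code files, and write the files to disk.

```python
import json
import re
from pathlib import Path

# Assumed (hypothetical) schema: documentation fields vs. a mapping of
# file name -> file content. The real key names may differ.
DOC_KEYS = ("summary", "tasks")
CODE_KEY = "artifacts"

# Greedy match from the first "{" to the last "}" so that any text the
# model adds before or after the JSON object is ignored.
_JSON_OBJECT = re.compile(r"\{.*\}", re.DOTALL)


def extract_json(raw: str) -> dict:
    """Return the JSON object embedded in raw model output."""
    match = _JSON_OBJECT.search(raw)
    if match is None:
        raise ValueError("no JSON object found in model output")
    # json.loads raises JSONDecodeError (a ValueError) on malformed JSON.
    return json.loads(match.group(0))


def split_payload(payload: dict) -> tuple[dict, dict]:
    """Separate documentation fields from code/schema files."""
    docs = {key: payload[key] for key in DOC_KEYS if key in payload}
    code = payload.get(CODE_KEY, {})
    return docs, code


def write_artifacts(code: dict, out_dir: Path) -> list[Path]:
    """Write each code file into out_dir and return the written paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, body in code.items():
        # Keep only the final path component so a model-supplied name
        # such as "../x.sql" cannot escape out_dir.
        safe_name = Path(name).name
        if not safe_name:
            continue
        target = out_dir / safe_name
        target.write_text(body, encoding="utf-8")
        written.append(target)
    return written


# Hypothetical model output with filler text around the JSON object.
SAMPLE_OUTPUT = (
    'Sure, here is the spec: {"summary": "Add a users table.", '
    '"tasks": [["Create schema", "high", "database"]], '
    '"artifacts": {"schema.sql": "CREATE TABLE users (id INTEGER PRIMARY KEY);"}} '
    "End of spec."
)

docs, code = split_payload(extract_json(SAMPLE_OUTPUT))
```

In this sketch, `docs` would feed the System Summary and Task Allocation Matrix views, while `write_artifacts(code, Path("out"))` would produce the downloadable files. The greedy regex assumes the model emits a single top-level JSON object; output containing several objects would need a stricter parser.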