fone / README.md
nathfavour's picture
Update README.md
a57c7d7 verified
|
Raw
History Blame Contribute Delete
3.43 kB
---
title: Fone
emoji: 🐠
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 6.18.0
python_version: '3.12'
app_file: app.py
pinned: false
license: agpl-3.0
short_description: automated Voice-to-Architecture pipeline that transforms raw
tags:
- track:backyard
- track:wood
---
# fone // Voice Architecture Pipeline
**fone** is a decentralized Voice-to-Architecture pipeline built for the **Build Small Hackathon**. It transforms fragmented, erratic spoken voice notes from developers directly into structured system design specifications, task allocation matrices, and isolated code/schema payloads.
By replacing open-ended conversation with a rigid, deterministic parsing workflow, it acts as a local-first engineering console optimized for technical execution.
---
## πŸ› οΈ The Under-32B Model Stack
This project achieves a lightweight footprint by chaining two specialized open-weights models from Cohere Labs. Each model remains far below the hackathon's parameter ceiling, delivering fast local-first or serverless inference.
| Layer | Model Identifier | Parameter Count | Core Responsibility |
| --- | --- | --- | --- |
| **Audio Processing** | `CohereLabs/cohere-transcribe-03-2026` | **2 Billion** | Blazing-fast speech-to-text processing with manual linguistic targets and strict punctuation boundaries. |
| **Linguistic Generation** | `CohereLabs/tiny-aya-earth` | **3.35 Billion** | Regional variant optimized for West Asian and African language structures. Ingests raw text to map out structural layouts. |
* **Total Combined Parameter Footprint:** 5.35B parameters.
---
## πŸŽ›οΈ Key Features
* **Voice-to-Spec Routing:** Bypasses conversational filler to map audio context directly into structured Markdown documents and executable code blocks using zero-temperature parameters.
* **Orchestration Hub Layout:** Built completely outside standard chat box wrappers using Gradio 6 `gr.Blocks`. It categorizes multi-layered data arrays into three distinct runtime views:
* **System Summary:** A comprehensive evaluation of the spoken feature set or database requirement.
* **Task Allocation Matrix:** A structured `gr.Dataframe` tracking specific objectives, priorities, and implementation contexts.
* **Code & Schema Artifacts:** A clean code interface showcasing extracted system configurations or setup scripts.
* **Sovereign Dark Aesthetic:** Completely customized using a clean, high-contrast monochrome design wrapper to match developer console guidelines.
---
## πŸš€ Technical Configuration
The application runtime environment leverages an isolated serverless API workflow to map incoming requests to Hugging Face infrastructure without local hardware dependencies.
### Dependencies (`requirements.txt`)
```text
gradio
huggingface_hub
```
### Core Pipeline Execution Loop
1. **Audio Capture:** Ingests raw `.wav` or `.mp3` payloads directly through the web interface.
2. **Context Alignment:** Passes the audio segment to the 2B transcription node.
3. **Structured Structuring:** Feeds the resulting text transcript into `tiny-aya-earth`, enforcing a rigid JSON output format through systemic system prompting.
4. **Deterministic Split:** Parses the JSON block through Python regex validations to isolate documentation fields from code files, writing them to disk for direct download.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference