fone / README.md
nathfavour's picture
Update README.md
a57c7d7 verified
|
Raw
History Blame Contribute Delete
3.43 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade
metadata
title: Fone
emoji: 🐠
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 6.18.0
python_version: '3.12'
app_file: app.py
pinned: false
license: agpl-3.0
short_description: automated Voice-to-Architecture pipeline that transforms raw
tags:
  - track:backyard
  - track:wood

fone // Voice Architecture Pipeline

fone is a decentralized Voice-to-Architecture pipeline built for the Build Small Hackathon. It transforms fragmented, erratic spoken voice notes from developers directly into structured system design specifications, task allocation matrices, and isolated code/schema payloads.

By replacing open-ended conversation with a rigid, deterministic parsing workflow, it acts as a local-first engineering console optimized for technical execution.


πŸ› οΈ The Under-32B Model Stack

This project achieves a lightweight footprint by chaining two specialized open-weights models from Cohere Labs. Each model remains far below the hackathon's parameter ceiling, delivering fast local-first or serverless inference.

Layer Model Identifier Parameter Count Core Responsibility
Audio Processing CohereLabs/cohere-transcribe-03-2026 2 Billion Blazing-fast speech-to-text processing with manual linguistic targets and strict punctuation boundaries.
Linguistic Generation CohereLabs/tiny-aya-earth 3.35 Billion Regional variant optimized for West Asian and African language structures. Ingests raw text to map out structural layouts.
  • Total Combined Parameter Footprint: 5.35B parameters.

πŸŽ›οΈ Key Features

  • Voice-to-Spec Routing: Bypasses conversational filler to map audio context directly into structured Markdown documents and executable code blocks using zero-temperature parameters.

  • Orchestration Hub Layout: Built completely outside standard chat box wrappers using Gradio 6 gr.Blocks. It categorizes multi-layered data arrays into three distinct runtime views:

  • System Summary: A comprehensive evaluation of the spoken feature set or database requirement.

  • Task Allocation Matrix: A structured gr.Dataframe tracking specific objectives, priorities, and implementation contexts.

  • Code & Schema Artifacts: A clean code interface showcasing extracted system configurations or setup scripts.

  • Sovereign Dark Aesthetic: Completely customized using a clean, high-contrast monochrome design wrapper to match developer console guidelines.


πŸš€ Technical Configuration

The application runtime environment leverages an isolated serverless API workflow to map incoming requests to Hugging Face infrastructure without local hardware dependencies.

Dependencies (requirements.txt)

gradio
huggingface_hub

Core Pipeline Execution Loop

  1. Audio Capture: Ingests raw .wav or .mp3 payloads directly through the web interface.
  2. Context Alignment: Passes the audio segment to the 2B transcription node.
  3. Structured Structuring: Feeds the resulting text transcript into tiny-aya-earth, enforcing a rigid JSON output format through systemic system prompting.
  4. Deterministic Split: Parses the JSON block through Python regex validations to isolate documentation fields from code files, writing them to disk for direct download.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference