title: Samantic Audio
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.18.0
python_version: '3.10'
app_file: app.py
pinned: false
tags:
- track:wood
- sponsor:openbmb
- sponsor:openai
- achievement:offgrid
- bonus:tiny-titan
Samantic Audio
Semantic-to-audio communication system that features a local robotic AI.
The application uses an LLM to generate responses, converts the response to speech using bark-small TTS, and processes the output through a custom synthesizer to create a unique "droid-style" voice.
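The three stages described above can be read as a simple function composition. The following is a minimal sketch, not the project's actual code: `run_pipeline` and the three stage callables are hypothetical names, and the stand-in stages only mimic the input and output types one would expect from the real LLM, TTS model, and synthesizer.

```python
from typing import Callable, List, Tuple

# Hypothetical stage signatures; the real app.py may be organized differently.
GenerateReply = Callable[[str], str]                 # prompt -> reply text (LLM)
TextToSpeech = Callable[[str], List[float]]          # reply text -> audio samples (TTS)
AudioEffect = Callable[[List[float]], List[float]]   # samples -> processed samples


def run_pipeline(
    prompt: str, llm: GenerateReply, tts: TextToSpeech, synth: AudioEffect
) -> Tuple[str, List[float]]:
    """Run prompt -> LLM -> TTS -> synthesizer and return (reply_text, samples)."""
    reply = llm(prompt)
    speech = tts(reply)
    return reply, synth(speech)


# Stand-in stages so the sketch runs without downloading any model.
def fake_llm(prompt: str) -> str:
    return f"Beep boop: {prompt}"


def fake_tts(text: str) -> List[float]:
    return [0.1] * len(text)  # one dummy sample per character


def fake_synth(samples: List[float]) -> List[float]:
    return [-s for s in samples]  # placeholder effect: invert polarity


reply, audio = run_pipeline("hello", fake_llm, fake_tts, fake_synth)
```

Passing the stages in as callables keeps each one swappable, so a different TTS model or effect chain could be dropped in without touching the orchestration.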
Links
| Resource | Link |
|---|---|
| Demo Video | YouTube |
| Social Post | X Post |
| Social Post | LinkedIn Post |
| GitHub Repository | GovIndLok/Voinal |
Hackathon Badges Claimed
| Category | Badge Name (Tag) | Description / Justification |
|---|---|---|
| Track | Thousand Token Wood (track:wood) | Whimsical, delightful, AI-native app |
| Sponsor | MiniCPM Build (sponsor:openbmb) | Used MiniCPM5-1B for LLM |
| Sponsor | Codex (sponsor:openai) | Codex-attributed commits in the connected GitHub repo |
| Achievement | Off the Grid (achievement:offgrid) | Both models run in-Space on ZeroGPU with no external AI API calls; the app can also run locally, as shown in the demo |
| Bonus | Tiny-titan (bonus:tiny-titan) | Models must be ≤ 4B parameters: 1B (MiniCPM5-1B) + ~240M (bark-small) ≈ 1.24B ≤ 4B |
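The Tiny-titan arithmetic from the table above can be checked in a few lines. This is only an illustrative sketch: the parameter counts are the approximate figures quoted in the table, not values measured from the model checkpoints, and the variable names are made up for the example.

```python
# Approximate parameter counts as quoted in the badge table (not measured).
PARAM_COUNTS = {
    "MiniCPM5-1B": 1_000_000_000,  # ~1B, used as the LLM
    "bark-small": 240_000_000,     # ~240M, used for TTS
}
BUDGET = 4_000_000_000  # Tiny-titan limit of 4B parameters

total = sum(PARAM_COUNTS.values())
within_budget = total <= BUDGET
print(f"total = {total / 1e9:.2f}B, within budget: {within_budget}")
```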
Features
- Local LLM Integration: Uses `MiniCPM5-1B` to generate responses locally.
- Text-to-Speech: Uses the `bark-small` TTS model for voice generation.
- Custom Audio Synthesis: The `synth.py` pipeline transforms audio into a droid-style output.
- Web Interface: A Gradio web UI to easily chat with the robot.
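The algorithm inside `synth.py` is not described here, so the following is not the project's method. It is a minimal sketch of one classic way to get a robotic timbre, ring modulation, in which each speech sample is multiplied by a low-frequency sine carrier. The function name `ring_modulate`, the 30 Hz carrier, and the 24 kHz sample rate are all assumptions chosen for illustration.

```python
import math
from typing import List


def ring_modulate(
    samples: List[float], sample_rate: int, carrier_hz: float = 30.0
) -> List[float]:
    """Multiply each sample by a sine carrier (ring modulation)."""
    return [
        s * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        for n, s in enumerate(samples)
    ]


# Demo input: 0.1 s of a 220 Hz tone standing in for TTS output.
SAMPLE_RATE = 24_000  # assumed demo value
tone = [
    math.sin(2.0 * math.pi * 220.0 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 10)
]
droid = ring_modulate(tone, SAMPLE_RATE)
```

Because the output is a product of two signals bounded by 1 in magnitude, it stays within [-1, 1] and needs no extra normalization before being written to a WAV file.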
Requirements
- Python >= 3.12
Setup & Installation
Activate your virtual environment and install the required packages:
source .venv/bin/activate
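pip install . # installs the dependencies declared in pyproject.toml (pip's -r flag expects a requirements file, not pyproject.toml)
pip install gradio numpy scipy torch soundfile # alternative: install the dependencies manually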
Running the Application
To start the Gradio UI locally:
python app.py
The application will be served at http://127.0.0.1:7860.
Project Structure
- `app.py`: Gradio entrypoint serving the UI and handling the full pipeline.
- `synth.py`: Audio-processing pipeline to transform standard WAVs into droid-style output.
- `tts_model.py`: `bark-small` TTS integration.
- `pyproject.toml`: Project metadata and dependencies.