---
title: Samantic Audio
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.18.0
python_version: '3.10'
app_file: app.py
pinned: false
tags:
  - track:wood
  - sponsor:openbmb
  - sponsor:openai
  - achievement:offgrid
  - bonus:tiny-titan
---

# Samantic Audio

Samantic Audio is a semantic-to-audio communication system featuring a local robotic AI.

The application generates a response with an LLM, converts it to speech with the `bark-small` TTS model, and passes the audio through a custom synthesizer to create a unique "droid-style" voice.

## Links

| Resource | Link |
| :--- | :--- |
| **Demo Video** | [YouTube](https://youtu.be/ID0IG2BIBRo) |
| **Social Post** | [X Post](https://x.com/GovindLokam/status/2066625722891026578?s=20) |
| **Social Post** | [LinkedIn Post](https://www.linkedin.com/posts/govind-lokam-335382230_ai-llm-generativeai-share-7472390184009895936-6YOz/?utm_source=share&utm_medium=member_desktop&rcm=ACoAADm4O1kBjwPZCkmLC-YSFR8At-SNQhkj4XY) |
| **GitHub Repository** | [GovIndLok/Voinal](https://github.com/GovIndLok/Voinal) |

## Hackathon Badges Claimed

| Category | Badge Name (Tag) | Description / Justification |
| :--- | :--- | :--- |
| **Track** | Thousand Token Wood (`track:wood`) | Whimsical, delightful, AI-native app |
| **Sponsor** | MiniCPM Build (`sponsor:openbmb`) | Uses MiniCPM5-1B as the LLM |
| **Sponsor** | Codex (`sponsor:openai`) | Codex-attributed commits in the connected GitHub repo |
| **Achievement** | Off the Grid (`achievement:offgrid`) | Both models run in-Space on ZeroGPU; no external AI API is called. The app can also run fully locally, as shown in the demo |
| **Bonus** | Tiny-titan (`bonus:tiny-titan`) | Models must total ≤ 4B parameters: 1B (MiniCPM5-1B) + ~240M (`bark-small`) ≈ 1.24B |

## Features
- **Local LLM Integration**: Uses `MiniCPM5-1B` to generate responses locally.
- **Text-to-Speech**: Uses the `bark-small` TTS model for voice generation.
- **Custom Audio Synthesis**: The `synth.py` pipeline transforms audio into a droid-style output.
- **Web Interface**: A Gradio web UI to easily chat with the robot.
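
The README does not describe how `synth.py` produces the droid voice, so the following is only an illustrative sketch of one common way to get a robotic timbre: ring modulation followed by bit crushing. The function names (`ring_modulate`, `bit_crush`, `droid_voice`), the 50 Hz carrier, the 6-bit depth, and the 24 kHz sample rate are all assumptions made for this example, not the project's actual implementation. It uses only the standard library so it runs without any model downloads.

```python
import math
from typing import List


def ring_modulate(samples: List[float], sample_rate: int, carrier_hz: float = 50.0) -> List[float]:
    """Multiply each sample by a sine carrier, which adds a metallic, robotic timbre."""
    step = 2.0 * math.pi * carrier_hz / sample_rate
    return [s * math.sin(step * n) for n, s in enumerate(samples)]


def bit_crush(samples: List[float], bits: int = 6) -> List[float]:
    """Quantize samples in [-1, 1] to roughly 2**bits levels for a lo-fi, digital edge."""
    levels = 2 ** (bits - 1)
    return [round(s * levels) / levels for s in samples]


def droid_voice(samples: List[float], sample_rate: int) -> List[float]:
    """Hypothetical droid effect: ring modulation, then bit crushing."""
    return bit_crush(ring_modulate(samples, sample_rate))


# Stand-in for TTS output: 0.1 s of a 440 Hz tone at an assumed 24 kHz sample rate.
SAMPLE_RATE = 24_000
tone = [0.8 * math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(2_400)]
droid = droid_voice(tone, SAMPLE_RATE)
```

A real pipeline would more likely operate on NumPy arrays for speed; plain lists are used here only to keep the sketch dependency-free.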

## Requirements
- Python >= 3.12 for local runs (note that the Space front matter sets `python_version: '3.10'`)

## Setup & Installation
Activate your virtual environment and install the required packages:

```bash
source .venv/bin/activate
pip install .  # reads dependencies from pyproject.toml; or install manually: pip install gradio numpy scipy torch soundfile
```

## Running the Application
To start the Gradio UI locally:

```bash
python app.py
```

By default, the application is served at `http://127.0.0.1:7860`.

## Project Structure
- `app.py`: Gradio entrypoint serving the UI and handling the full pipeline.
- `synth.py`: Audio-processing pipeline to transform standard WAVs into droid-style output.
- `tts_model.py`: `bark-small` TTS integration.
- `pyproject.toml`: Project metadata and dependencies.
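
As a rough sketch of how these pieces might fit together, the three stages (LLM reply, TTS, droid synthesis) can be chained behind a single call. The `DroidPipeline` class and its field names are hypothetical and do not come from the repository; the stub lambdas stand in for the real `MiniCPM5-1B`, `bark-small`, and `synth.py` code so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Audio = List[float]  # mono samples in [-1.0, 1.0]; a simplification for this sketch


@dataclass
class DroidPipeline:
    """Chains the stages: user text -> reply text -> speech -> droid-style audio."""

    generate_reply: Callable[[str], str]        # hypothetical LLM wrapper
    synthesize_speech: Callable[[str], Audio]   # hypothetical TTS wrapper
    droidify: Callable[[Audio], Audio]          # hypothetical synth.py entry point

    def respond(self, user_message: str) -> Tuple[str, Audio]:
        reply = self.generate_reply(user_message)
        speech = self.synthesize_speech(reply)
        return reply, self.droidify(speech)


# Stub stages so the sketch runs without downloading any model.
pipeline = DroidPipeline(
    generate_reply=lambda msg: f"Beep. You said: {msg}",
    synthesize_speech=lambda text: [0.1] * len(text),
    droidify=lambda audio: [-s for s in audio],
)
reply, audio = pipeline.respond("hello")
```

Passing the stages in as callables keeps each one swappable, which is convenient for testing the UI wiring without loading either model.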