title: Podify - AI Podcast Generator
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.27.0
app_file: app.py
python_version: '3.10'
hardware: zero-a10g
suggested_hardware: zero-a10g
pinned: false
short_description: Research a topic and turn it into a voiced podcast
tags:
- track:backyard
- achievement:offgrid
- achievement:offbrand
- achievement:fieldnotes
🎙️ Podify - AI Podcast Generator
Turn any topic into a finished, voiced podcast in two phases:
- Content: research agents (LangGraph) use a HuggingFace-hosted LLM plus live DuckDuckGo web search to research the topic and write a speaker-tagged script.
- Audio: the self-hosted Fish Audio / OpenAudio S1-mini model speaks the script, with selectable preset voices and zero-shot voice cloning from an uploaded clip or a live mic recording.
Everything runs inside this single Gradio Space; the TTS model runs on ZeroGPU.
Architecture
Topic ──▶ LangGraph: plan ──▶ DDG search ──▶ outline ──▶ write ──▶ Script
Script ──▶ Fish Audio S1-mini (@spaces.GPU): per-line synth ──▶ stitched podcast WAV
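The first line of the diagram can be read as a chain of stages that each take a shared state and return an updated one. Below is a minimal plain-Python sketch of that flow. It is not the project's graph.py and it does not use LangGraph; the names ResearchState, plan, search, outline, write, and run_pipeline are hypothetical, and the LLM and DuckDuckGo calls are replaced by stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchState:
    """Shared state handed from one stage to the next (hypothetical shape)."""
    topic: str
    queries: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    outline: List[str] = field(default_factory=list)
    script: List[str] = field(default_factory=list)


def plan(state: ResearchState) -> ResearchState:
    # Stub: a real planner would ask the LLM for search queries.
    state.queries = [f"{state.topic} overview", f"{state.topic} recent news"]
    return state


def search(state: ResearchState) -> ResearchState:
    # Stub: a real implementation would run a web search per query.
    state.notes = [f"(placeholder result for: {q})" for q in state.queries]
    return state


def outline(state: ResearchState) -> ResearchState:
    # Stub: one outline section per research note.
    state.outline = [f"Section {i + 1}" for i in range(len(state.notes))]
    return state


def write(state: ResearchState) -> ResearchState:
    # Stub: alternate two speakers so every line carries a speaker tag.
    speakers = ["Host", "Guest"]
    state.script = [
        f"{speakers[i % 2]}: {section} on {state.topic}."
        for i, section in enumerate(state.outline)
    ]
    return state


# Stage order mirrors the diagram: plan, search, outline, write.
STAGES: List[Callable[[ResearchState], ResearchState]] = [plan, search, outline, write]


def run_pipeline(topic: str) -> ResearchState:
    state = ResearchState(topic=topic)
    for stage in STAGES:
        state = stage(state)
    return state


if __name__ == "__main__":
    for line in run_pipeline("solar sails").script:
        print(line)
```

Keeping every stage as a function from state to state is what makes the chain easy to move into a graph framework later, since each function can become a node without changing its signature.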
- app.py: Gradio Blocks UI (two tabs) wiring both phases.
- research/: llm.py (HF Inference client), search.py (DuckDuckGo), graph.py (LangGraph research graph).
- tts/: engine.py (model load + GPU synthesis + multi-speaker stitching), voices.py (preset voice registry).
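Per-line synthesis followed by stitching might look roughly like the sketch below. It is not the code in engine.py: synthesize_line is a hypothetical stand-in that returns a sine tone instead of calling S1-mini, and the "Speaker: text" line format, the 44.1 kHz sample rate, and the 250 ms pause are assumptions chosen for illustration. Only the standard library is used.

```python
import io
import math
import wave
from array import array
from typing import List, Tuple

SAMPLE_RATE = 44_100  # assumed output rate, not taken from the project
PAUSE_MS = 250        # assumed silence between consecutive lines
MS_PER_WORD = 50      # stub only: tone length grows with word count


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Split 'Speaker: text' lines into (speaker, text) pairs; skip the rest."""
    pairs = []
    for raw in script.splitlines():
        speaker, sep, text = raw.partition(":")
        if sep and speaker.strip() and text.strip():
            pairs.append((speaker.strip(), text.strip()))
    return pairs


def synthesize_line(speaker: str, text: str) -> array:
    """Hypothetical TTS stand-in: 16-bit sine tone, pitch chosen per speaker."""
    freq = 220.0 if speaker == "Host" else 330.0
    n = SAMPLE_RATE * MS_PER_WORD * max(1, len(text.split())) // 1000
    return array("h", (
        int(12000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
        for i in range(n)
    ))


def stitch(script: str) -> bytes:
    """Synthesize each line, join with short silences, return mono WAV bytes."""
    silence = array("h", [0] * (SAMPLE_RATE * PAUSE_MS // 1000))
    samples = array("h")
    for idx, (speaker, text) in enumerate(parse_script(script)):
        if idx:  # no pause before the first line
            samples.extend(silence)
        samples.extend(synthesize_line(speaker, text))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(samples.tobytes())
    return buf.getvalue()


if __name__ == "__main__":
    demo = "Host: Welcome to the show\nGuest: Glad to be here"
    print(len(stitch(demo)), "bytes of WAV data")
```

Synthesizing one line at a time keeps each call short and lets a different voice be chosen per speaker; the trade-off is that pauses between lines have to be added explicitly during stitching.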
Configuration
Set these as Space secrets / variables (Settings → Variables and secrets):
| Name | Required | Purpose |
|---|---|---|
| HF_TOKEN | yes | LLM inference (Inference Providers) + model download. |
| LLM_MODEL | optional | Override the content LLM (default Qwen/Qwen2.5-14B-Instruct, <32B). |
| TTS_MODEL_REPO | optional | Override the TTS model repo (default fishaudio/openaudio-s1-mini). |
ZeroGPU requires the Space owner to have a HuggingFace PRO account.
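One way to read these variables with their documented defaults is sketched below. The Settings class and load_settings function are hypothetical names, not taken from app.py; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults as listed in the configuration table.
DEFAULT_LLM_MODEL = "Qwen/Qwen2.5-14B-Instruct"
DEFAULT_TTS_MODEL_REPO = "fishaudio/openaudio-s1-mini"


@dataclass(frozen=True)
class Settings:
    hf_token: str
    llm_model: str
    tts_model_repo: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the three variables, failing early if HF_TOKEN is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is required; set it as a Space secret.")
    return Settings(
        hf_token=token,
        # Unset or empty values fall back to the documented defaults.
        llm_model=env.get("LLM_MODEL") or DEFAULT_LLM_MODEL,
        tts_model_repo=env.get("TTS_MODEL_REPO") or DEFAULT_TTS_MODEL_REPO,
    )


if __name__ == "__main__":
    print(load_settings({"HF_TOKEN": "hf_xxx"}))
```

Passing the environment in as a mapping keeps the function easy to exercise without touching the real process environment.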
Run locally
pip install -r requirements.txt
export HF_TOKEN=hf_xxx # PowerShell: $env:HF_TOKEN="hf_xxx"
python app.py
Phase 1 (research + script) runs on CPU. Phase 2 (TTS) needs a GPU and the
fish-speech package; on CPU-only machines the UI loads but synthesis is disabled.
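A check along the following lines could decide whether to enable synthesis on a local machine. This is a sketch, not the logic in app.py: missing_modules and tts_available are hypothetical names, and it assumes the fish-speech package is importable as `fish_speech` and that GPU support is detected through PyTorch.

```python
import importlib
import importlib.util
from typing import Iterable, List

# Assumption: these are the import names the TTS phase depends on.
REQUIRED_MODULES = ("torch", "fish_speech")


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def tts_available() -> bool:
    """True only if the required packages are installed and CUDA is visible."""
    if missing_modules(REQUIRED_MODULES):
        return False
    torch = importlib.import_module("torch")
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    if tts_available():
        print("TTS enabled")
    else:
        print("TTS disabled; missing:", missing_modules(REQUIRED_MODULES) or "GPU")
```

The boolean can then drive the UI, for example by passing it as the `interactive` argument of the Gradio button that starts synthesis.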
Models Used
- Qwen/Qwen2.5-7B-Instruct for research and script generation
- fishaudio/openaudio-s1-mini (0.5B parameters) for audio generation
Deploy to a Space
huggingface-cli login
huggingface-cli upload <user>/podify . --repo-type=space
# or: git push to the Space remote (preset .wav files tracked via Git LFS)
Credits / assets
- Voice samples (tts/voices/): derived from CMU ARCTIC (free for research and commercial use). Rebuild with scripts/build_voice_samples.py.
- Background-music loops (tts/music_loops/): FreePD by Kevin MacLeod, 100% public domain (CC0). Rebuild with scripts/build_music_loops.py. A procedural numpy fallback in tts/music.py is used if the loops are absent.
Contributors
- nvipin63
- jayaspjacob
#backyard-ai
