podify / README.md

jayaspjacob

Update README.md

5878205 verified 16 days ago


metadata

title: Podify - AI Podcast Generator
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.27.0
app_file: app.py
python_version: '3.10'
hardware: zero-a10g
suggested_hardware: zero-a10g
pinned: false
short_description: Research a topic and turn it into a voiced podcast
tags:
  - track:backyard
  - achievement:offgrid
  - achievement:offbrand
  - achievement:fieldnotes

🎙️ Podify — AI Podcast Generator

Turn any topic into a finished, voiced podcast in two phases:

1. Content: LangGraph research agents combine a Hugging Face-hosted LLM with live DuckDuckGo web search to research the topic and write a speaker-tagged script.
2. Audio: the self-hosted Fish Audio / OpenAudio S1-mini model speaks the script, with selectable preset voices and zero-shot voice cloning from an uploaded clip or a live mic recording.
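
The hand-off between the two phases is the speaker-tagged script. As a rough illustration of how such a script could be split into per-speaker turns before synthesis, here is a minimal sketch. The `Speaker: text` tag format, the `parse_script` helper, and the sample text are all assumptions made for this example; the project's actual format may differ.

```python
import re
from typing import List, Tuple

# Assumed tag format: "Speaker: text". The real script format is not documented here.
LINE_RE = re.compile(r"^\s*([A-Za-z][\w ]*?)\s*:\s*(.+?)\s*$")


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Split a speaker-tagged script into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for raw in script.splitlines():
        match = LINE_RE.match(raw)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif raw.strip() and turns:
            # Untagged line: treat it as a continuation of the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + raw.strip())
    return turns


SAMPLE = """Host: Welcome to the show.
Guest: Thanks for having me.
It is great to be here."""

for speaker, text in parse_script(SAMPLE):
    print(f"{speaker!r} says {text!r}")
```

A regex this simple will misread spoken text that happens to contain a colon early in an untagged line, so a real parser would likely restrict the speaker names to a known set.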

Everything runs inside this single Gradio Space; the TTS model runs on ZeroGPU.

Architecture

Topic ─▶ LangGraph: plan ─▶ DDG search ─▶ outline ─▶ write ─▶ Script
Script ─▶ Fish Audio S1-mini (@spaces.GPU): per-line synth ─▶ stitched podcast WAV
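
To make the node order in the first line of the diagram concrete, the sketch below chains four plain Python functions over a shared state dictionary. It deliberately does not use LangGraph, the LLM, or DuckDuckGo: every function body is a stand-in invented for illustration, and only the plan, search, outline, write ordering comes from the diagram.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def plan(state: State) -> State:
    # Turn the topic into search queries (a real node would ask the LLM).
    topic = state["topic"]
    return {**state, "queries": [f"{topic} overview", f"{topic} recent developments"]}


def search(state: State) -> State:
    # Stand-in for web search: echo each query back as a fake finding.
    return {**state, "notes": [f"note for: {q}" for q in state["queries"]]}


def outline(state: State) -> State:
    # One outline segment per research note.
    segments = [f"Segment {i + 1}: {n}" for i, n in enumerate(state["notes"])]
    return {**state, "outline": segments}


def write(state: State) -> State:
    # Alternate two hypothetical speakers across the outline segments.
    speakers = ["Host", "Guest"]
    lines = [f"{speakers[i % 2]}: {seg}" for i, seg in enumerate(state["outline"])]
    return {**state, "script": "\n".join(lines)}


PIPELINE: List[Callable[[State], State]] = [plan, search, outline, write]


def run(topic: str) -> State:
    """Pass the state through each node in order and return the final state."""
    state: State = {"topic": topic}
    for node in PIPELINE:
        state = node(state)
    return state


print(run("solar power")["script"])
```

Each node returns a new state rather than mutating its input, which mirrors the state-passing style that graph frameworks generally use and keeps each step easy to test in isolation.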

app.py — Gradio Blocks UI (two tabs) wiring both phases.
research/ — llm.py (HF Inference client), search.py (DuckDuckGo), graph.py (LangGraph research graph).
tts/ — engine.py (model load + GPU synthesis + multi-speaker stitching), voices.py (preset voice registry).
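
The per-line synthesis and stitching step can be pictured with the standard library alone. In the sketch below, `fake_synth` is a hypothetical placeholder that emits a sine tone instead of speech, and the sample rate, gap length, and per-speaker tone assignment are assumptions; it only illustrates how per-line clips could be joined into one WAV, not how tts/engine.py actually does it.

```python
import io
import math
import struct
import wave
from typing import Dict, List, Tuple

SAMPLE_RATE = 16_000  # assumed; the real model's output rate may differ


def fake_synth(text: str, freq: float) -> List[int]:
    """Hypothetical stand-in for TTS: a tone whose length scales with word count."""
    n = (SAMPLE_RATE // 20) * max(len(text.split()), 1)  # 50 ms per word
    return [int(8000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)) for i in range(n)]


def stitch(turns: List[Tuple[str, str]], gap_s: float = 0.25) -> bytes:
    """Synthesize each turn and join the clips, separated by silence, into one WAV."""
    voices: Dict[str, float] = {}  # speaker -> tone frequency, set on first appearance
    samples: List[int] = []
    gap = [0] * int(SAMPLE_RATE * gap_s)
    for index, (speaker, text) in enumerate(turns):
        freq = voices.setdefault(speaker, 220.0 + 110.0 * len(voices))
        if index:
            samples.extend(gap)  # silence between turns, not before the first
        samples.extend(fake_synth(text, freq))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


podcast = stitch([("Host", "hello there"), ("Guest", "hi")])
print(len(podcast), "bytes of WAV data")
```

Synthesizing line by line keeps each GPU call short and lets every turn use its own voice, at the cost of needing a consistent sample rate across all clips before they are concatenated.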

Configuration

Set these as Space secrets / variables (Settings → Variables and secrets):

| Name | Required | Purpose |
| --- | --- | --- |
| `HF_TOKEN` | ✅ | LLM inference (Inference Providers) + model download. |
| `LLM_MODEL` | optional | Override the content LLM (default `Qwen/Qwen2.5-14B-Instruct`, <32B). |
| `TTS_MODEL_REPO` | optional | Override the TTS model repo (default `fishaudio/openaudio-s1-mini`). |

ZeroGPU requires the Space owner to have a Hugging Face PRO account.
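
One way an app could read these settings at startup is sketched below. The variable names and default values come from the configuration table; the `Settings` type and `load_settings` helper are hypothetical names introduced for this example, not the project's actual code.

```python
import os
from typing import Mapping, NamedTuple


class Settings(NamedTuple):
    hf_token: str
    llm_model: str
    tts_model_repo: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three variables, failing early if the required token is missing."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is required for LLM inference and model download")
    return Settings(
        hf_token=token,
        # Empty or unset optional variables fall back to the documented defaults.
        llm_model=env.get("LLM_MODEL") or "Qwen/Qwen2.5-14B-Instruct",
        tts_model_repo=env.get("TTS_MODEL_REPO") or "fishaudio/openaudio-s1-mini",
    )


# Example with an explicit mapping instead of the real environment.
print(load_settings({"HF_TOKEN": "hf_xxx", "LLM_MODEL": ""}))
```

Passing the environment in as a mapping keeps the function easy to exercise without touching real Space secrets.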

Run locally

pip install -r requirements.txt
export HF_TOKEN=hf_xxx          # PowerShell: $env:HF_TOKEN="hf_xxx"
python app.py

Phase 1 (research + script) runs on CPU. Phase 2 (TTS) needs a GPU and the fish-speech package; on CPU-only machines the UI loads but synthesis is disabled.
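
A local capability check along these lines could decide whether to enable the synthesis tab. This is a sketch under assumptions: `tts_status` is a hypothetical helper, `fish_speech` is assumed to be the import name of the fish-speech package, and the app's real check may work differently (especially on ZeroGPU, where the GPU is attached only inside decorated functions).

```python
import importlib.util
from typing import Tuple


def tts_status() -> Tuple[bool, str]:
    """Return (enabled, reason) describing whether local synthesis looks possible."""
    if importlib.util.find_spec("torch") is None:
        return False, "PyTorch is not installed"
    # Assumption: the fish-speech package is importable as `fish_speech`.
    if importlib.util.find_spec("fish_speech") is None:
        return False, "the fish-speech package is not installed"
    import torch  # imported lazily so CPU-only setups without torch still load

    if not torch.cuda.is_available():
        return False, "no CUDA GPU detected"
    return True, "ready"


enabled, reason = tts_status()
print(f"TTS enabled: {enabled} ({reason})")
```

Returning a reason alongside the flag lets the UI explain why synthesis is disabled instead of silently hiding it.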

Models Used

Qwen/Qwen2.5-7B-Instruct: research and script generation.
fishaudio/openaudio-s1-mini (0.5B parameters): audio generation.

Deploy to a Space

huggingface-cli login
huggingface-cli upload <user>/podify . --repo-type=space
# or: git push to the Space remote (preset .wav files tracked via Git LFS)

Credits / assets

Voice samples (tts/voices/): derived from CMU ARCTIC (free for research and commercial use). Rebuild with scripts/build_voice_samples.py.
Background-music loops (tts/music_loops/): FreePD by Kevin MacLeod — 100% public domain (CC0). Rebuild with scripts/build_music_loops.py. A procedural numpy fallback in tts/music.py is used if the loops are absent.

Contributors

nvipin63
jayaspjacob

#backyard-ai

Blog: Article
Social Media Post: Post
Demo: Video