
	---
	title: AI Radio - Personalized Radio Station
	emoji: 🎵
	colorFrom: purple
	colorTo: pink
	sdk: gradio
	sdk_version: 4.44.1
	app_file: app.py
	pinned: false
	license: mit
	short_description: Entertaining AI Radio with LLM as a Host
	tags:
	- mcp-in-action-track-consumer
	- mcp-in-action-track-creative
	---

	# 🎵 AI Radio - Your Personal Radio Station

	## 🌟 Overview

	AI Radio is a personalized radio station with an LLM as its host. It builds a listening experience around your preferences, mood, and interests. Built for the MCP 1st Birthday Competition, this app demonstrates autonomous agent behavior, MCP integration, and RAG-based personalization.

	## 🦾 Participants
	https://huggingface.co/nikimakarov

	## Social
	https://x.com/nikimakarov/status/1995277463337967838

	## Demo on YouTube
	https://youtu.be/tAIlor2TieM

	## 🛠️ Technology Stack

	- Gradio: Interactive web interface
	- OpenAI GPT-OSS-120B (provided by Nebius): LLM for content generation, host commentary, and reasoning
	- ElevenLabs: High-quality text-to-speech for voice generation
	- Modal: Hosting for the YouTube proxy service
	- LlamaIndex: RAG system for personalized recommendations and user preference management
	- MCP (Model Context Protocol): Structured tool servers for modular functionality
	- Python: Core application logic
	- RSS Feeds: Real-time news aggregation

	## 🙏 Acknowledgments

	- Gradio Team for the beautiful UI framework
	- MCP Team for the amazing protocol and competition
	- OpenAI for creating GPT-OSS-120B
	- Modal for providing servers for the YouTube proxy
	- Nebius for providing the model API
	- ElevenLabs for text-to-speech technology
	- LlamaIndex for RAG capabilities

	## 📁 Project Structure

	```
	ai_radio/
	├── src/                  # Source code
	│   ├── app.py            # Main Gradio application
	│   ├── config.py         # Configuration
	│   ├── radio_agent.py    # AI agent logic
	│   ├── tts_service.py    # Text-to-speech service
	│   ├── rag_system.py     # RAG system
	│   ├── voice_input.py    # Voice input handling
	│   └── mcp_servers/      # MCP server implementations
	├── docs/                 # Documentation files
	├── audio/                # Generated audio files
	├── logs/                 # Log files
	├── requirements.txt      # Python dependencies
	└── run.py                # Application entry point
	```

	## ✨ Features

	### 🎵 Personalized Music
	- Curated tracks based on your favorite genres and current mood
	- Free music recommendations from multiple genres
	- Dynamic DJ commentary generated by AI

	### 📰 Custom News Updates
	- Real-time news from RSS feeds
	- Personalized news based on your interests (technology, world, business, entertainment, science)
	- AI-generated news scripts delivered in a conversational style

	### 🎙️ Podcast Recommendations
	- Discover trending podcasts in various categories
	- Personalized recommendations based on your interests
	- Engaging introductions to each podcast

	### 📖 AI-Generated Stories
	- Entertaining stories and fascinating fun facts
	- Tailored to your interests and mood
	- Perfect short-form content between segments

	### 🤖 AI Radio Host
	- Charismatic AI host powered by Nebius GPT-OSS-120B
	- Personalized greetings and interactions
	- Smooth transitions between segments

	### 💾 Smart Recommendations (RAG)
	- LlamaIndex-powered RAG system learns from your listening history
	- Improves recommendations over time
	- Stores your preferences and provides insights

	## 🏆 MCP Competition Requirements

	This app fulfills all requirements for Track 2 (MCP in Action):

	✅ Autonomous Agent Behavior: The Radio Agent demonstrates planning (building a show plan), reasoning (selecting segments based on preferences), and execution (generating and delivering content)

	✅ MCP Servers as Tools: Implements three MCP servers:
	- MusicMCPServer: Music search and playlist generation
	- NewsMCPServer: News fetching and curation
	- PodcastMCPServer: Podcast discovery and recommendations

	✅ Gradio App: Beautiful, intuitive radio station interface

	✅ Advanced Features:
	- RAG System: Uses LlamaIndex for context-aware recommendations
	- Context Engineering: Personalizes content based on user preferences and history
	- Streaming Capabilities: Efficient audio generation and playback


	## 🚀 Getting Started

	### Option 1: Deploy to HuggingFace Spaces (Recommended) 🚀

	Quick Deploy:

	1. Create a new Space on [HuggingFace Spaces](https://huggingface.co/spaces)
	   - SDK: Gradio
	   - Hardware: CPU (or GPU if available)

	2. Configure Secrets (Settings → Secrets):
	   - `ELEVENLABS_API_KEY` - Your ElevenLabs API key
	   - `NEBIUS_API_KEY` - Your Nebius API key (for GPT-OSS-120B)
	   - `LLAMAINDEX_API_KEY` - Optional, for LlamaIndex embeddings
	   - `OPENAI_API_KEY` - Optional, for OpenAI embeddings fallback

	3. Push your code to the Space repository

	4. Access your Space at `https://huggingface.co/spaces/<username>/<space-name>`

	📖 Detailed deployment guide: See [DEPLOY_HF.md](DEPLOY_HF.md)

	### Option 2: Local Installation

	#### Prerequisites

	- Python 3.9+
	- Nebius API key (for GPT-OSS-120B LLM)
	- ElevenLabs API key

	#### Installation

	1. Clone the repository:
	```bash
	git clone <your-repo-url>
	cd ai_radio
	```

	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	3. Set up environment variables or edit `src/config.py`:
	```bash
	export ELEVENLABS_API_KEY="your-key-here"
	export NEBIUS_API_KEY="your-key-here"
	export LLAMAINDEX_API_KEY="your-key-here"
	```

	4. Run the app:
	```bash
	python run.py
	```

	5. Open your browser to `http://localhost:7871`, where the app is served.
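
Before launching, it can help to confirm that the required keys are set. The sketch below is a hypothetical helper (`missing_keys` is not part of the project) that checks the variable names listed in the setup steps. How `src/config.py` actually reads them is not shown in this README, so treat this as an assumption-based check only.

```python
import os

# Variable names taken from the setup steps; the split into required
# and optional follows the Prerequisites and Secrets lists.
REQUIRED_KEYS = ("ELEVENLABS_API_KEY", "NEBIUS_API_KEY")
OPTIONAL_KEYS = ("LLAMAINDEX_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None, names=REQUIRED_KEYS):
    """Return the names in `names` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing required keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```

Running this in the same shell where you exported the keys reports any that are still missing.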

	## 📖 How to Use

	### 1️⃣ Set Your Preferences
	- Go to the "Your Preferences" tab
	- Enter your name
	- Select your favorite music genres
	- Choose news topics you're interested in
	- Pick podcast categories
	- Set your current mood
	- Click "Save Preferences"

	### 2️⃣ Start Your Radio
	- Navigate to the "Radio Player" tab
	- Click "▶️ Start Radio"
	- Enjoy your personalized radio show!

	### 3️⃣ Controls
	- ▶️ Start Radio: Begin your personalized show
	- ⏭️ Next Segment: Skip to the next segment
	- ⏹️ Stop: Pause the radio (you can resume later)

	### 4️⃣ Track Your Stats
	- Visit the "Your Stats" tab
	- See your listening history and statistics
	- The RAG system uses this data to improve recommendations

	## 🎨 User Interface

	The app features a beautiful, modern radio station interface with:
	- Gradient color schemes for visual appeal
	- Responsive layout that works on all devices
	- Real-time status updates showing what's playing
	- Progress tracking through your show
	- Audio player with auto-play functionality

	## 🧠 How It Works

	### Agent Architecture

	The RadioAgent is an autonomous AI agent that:

	1. Plans: Analyzes user preferences and creates a balanced show plan with music, news, podcasts, and stories
	2. Reasons: Uses the RAG system to make intelligent decisions about content selection
	3. Executes: Generates content using Nebius GPT-OSS-120B LLM and delivers it via ElevenLabs TTS
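
The three stages can be pictured as a small loop. This is a minimal sketch, not the code in `radio_agent.py`: `Segment`, `plan`, `reason`, `execute`, and `run_show` are hypothetical names, the segment order is fixed for simplicity, and the LLM and TTS calls are replaced by stand-in callables.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str          # "music", "news", "podcast", or "story"
    script: str = ""   # host commentary, filled in during execution


def plan(preferences, order=("music", "news", "music", "podcast", "story")):
    """Plan: lay out the segment kinds for the show (fixed order here)."""
    return [Segment(kind) for kind in order]


def reason(segment, preferences):
    """Reason: pick a topic for the segment from the user's preferences."""
    choices = preferences.get(segment.kind, [])
    return choices[0] if choices else "general"


def execute(segment, topic, generate=None, speak=None):
    """Execute: generate the host script, then hand it to text-to-speech."""
    # Stand-ins for the LLM and TTS calls, which are not shown here.
    generate = generate or (lambda kind, topic: f"Up next: {topic} {kind}.")
    speak = speak or (lambda text: None)
    segment.script = generate(segment.kind, topic)
    speak(segment.script)
    return segment


def run_show(preferences):
    """Run plan -> reason -> execute for every segment in the show."""
    return [execute(seg, reason(seg, preferences)) for seg in plan(preferences)]
```

Passing real `generate` and `speak` callables into `execute` is where an LLM client and a TTS client would plug in.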

	### MCP Server Architecture

	Three specialized MCP servers provide modular functionality:

	```python
	# Tool signatures exposed by each MCP server (bodies omitted)

	# Music Server
	def search_music(genre, mood, limit): ...
	def get_personalized_playlist(user_preferences): ...

	# News Server
	def fetch_news(category, limit): ...
	def get_personalized_news(user_preferences): ...

	# Podcast Server
	def get_trending_podcasts(category, limit): ...
	def get_personalized_podcasts(user_preferences): ...
	```
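
As an illustration of what a tool like `fetch_news(category, limit)` might do once it has downloaded a feed, here is a sketch that parses RSS 2.0 XML with the standard library. The sample feed and the `parse_rss` helper are made up for this example; the project's actual feed URLs and parsing code are not shown in this README.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Tech Feed</title>
    <item><title>First headline</title><link>https://example.com/1</link></item>
    <item><title>Second headline</title><link>https://example.com/2</link></item>
    <item><title>Third headline</title><link>https://example.com/3</link></item>
  </channel>
</rss>"""


def parse_rss(xml_text, limit=5):
    """Return up to `limit` feed items as dicts with 'title' and 'link'."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        if len(items) >= limit:
            break
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items
```

The resulting list of headlines is the kind of structured input the host could turn into a spoken news script.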

	### RAG System

	The RAG system powered by LlamaIndex:
	- Stores user preferences and listening history
	- Provides context-aware recommendations
	- Learns from user behavior over time
	- Supports semantic search over user data
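
To give a feel for retrieval over user data, here is a toy sketch. It is not LlamaIndex and uses no embeddings: it ranks stored notes by cosine similarity of word counts, a much cruder stand-in for semantic search. The function names and the sample `history` entries are hypothetical.

```python
import math
import re
from collections import Counter


def _vector(text):
    """Bag-of-words vector: lowercase word -> count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return the `top_k` documents most similar to `query`."""
    q = _vector(query)
    ranked = sorted(documents, key=lambda d: _cosine(q, _vector(d)), reverse=True)
    return ranked[:top_k]


# Made-up listening-history notes for illustration.
history = [
    "Listened to a jazz music segment in a relaxed mood",
    "Skipped a business news segment",
    "Listened to a technology podcast about open source",
]
```

A real embedding-based index would also match related wording (for example "tech" and "technology"), which plain word overlap cannot do.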

	## 🎯 Show Planning Algorithm

	The agent uses intelligent planning to create a balanced show:

	1. Analyze preferences: Load user preferences from RAG system
	2. Calculate distribution: Determine segment ratios (50% music, 20% news, 20% podcasts, 10% stories)
	3. Generate segments: Use MCP servers to fetch content for each segment
	4. Add personality: Generate host commentary using Nebius GPT-OSS-120B LLM
	5. Execute: Convert text to speech and play audio
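
Step 2 can be worked through with a little arithmetic. The ratios below come from the list above; the rounding rule (largest remainder) and the name `segment_counts` are assumptions made for this sketch, since the README does not say how the agent handles show lengths that do not divide evenly.

```python
# Ratios from step 2, as integer percentages.
RATIOS = {"music": 50, "news": 20, "podcast": 20, "story": 10}


def segment_counts(total_segments, ratios=RATIOS):
    """Split `total_segments` across segment kinds, keeping the sum exact.

    Uses largest-remainder rounding: floor each share, then give the
    leftover segments to the kinds with the largest remainders.
    """
    total_ratio = sum(ratios.values())
    counts, remainders = {}, {}
    for kind, ratio in ratios.items():
        counts[kind], remainders[kind] = divmod(total_segments * ratio, total_ratio)
    leftover = total_segments - sum(counts.values())
    by_remainder = sorted(ratios, key=lambda k: remainders[k], reverse=True)
    for kind in by_remainder[:leftover]:
        counts[kind] += 1
    return counts
```

For a 10-segment show this gives 5 music, 2 news, 2 podcast, and 1 story segment.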

	## 🔊 Audio Generation

	- Text-to-Speech: ElevenLabs API converts host commentary to natural-sounding speech
	- Voice Selection: Uses the "Rachel" voice by default (customizable)
	- Streaming: Efficient audio generation and playback
	- Quality: High-quality MP3 output
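
For orientation, the sketch below shows roughly what a single text-to-speech request looks like. It only builds a description of the HTTP call and sends nothing. The endpoint path and the `xi-api-key` header follow the ElevenLabs REST API as I understand it, so check them against the current ElevenLabs documentation; `build_tts_request` is a hypothetical helper, and the voice ID for "Rachel" has to be looked up in your ElevenLabs account. The project's own `tts_service.py` may well use the official SDK instead.

```python
import json

API_BASE = "https://api.elevenlabs.io/v1"  # assumed public REST base URL


def build_tts_request(text, voice_id, api_key, model_id=None):
    """Describe the HTTP POST for one text-to-speech call (nothing is sent)."""
    body = {"text": text}
    if model_id is not None:
        body["model_id"] = model_id  # optional; omitted uses the API default
    return {
        "method": "POST",
        "url": f"{API_BASE}/text-to-speech/{voice_id}",
        "headers": {
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",  # ask for MP3 output
        },
        "data": json.dumps(body).encode("utf-8"),
    }
```

The returned dict can be handed to any HTTP client; the response body would be the MP3 bytes to save under `audio/`.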

	## 📊 Personalization

	The app learns from your behavior:
	- Preference Storage: Saves your favorite genres, interests, and moods
	- History Tracking: Records what you've listened to
	- Recommendation Engine: Uses RAG to suggest relevant content
	- Adaptive Planning: Adjusts future shows based on your history
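
One simple way to picture adaptive planning is to blend saved favorites with what was actually played. The weighting scheme and the name `genre_weights` are assumptions for this sketch; the app itself relies on its RAG system, whose scoring is not described here.

```python
from collections import Counter


def genre_weights(favorite_genres, history, bonus=1.0):
    """Weight each genre: 1.0 if it is a saved favorite, plus `bonus`
    times its share of the listening history."""
    plays = Counter(history)
    total = sum(plays.values())
    weights = {}
    for genre in set(favorite_genres) | set(plays):
        base = 1.0 if genre in favorite_genres else 0.0
        share = plays[genre] / total if total else 0.0
        weights[genre] = base + bonus * share
    return weights
```

With favorites `["jazz", "rock"]` and a history of three jazz plays and one pop play, jazz scores 1.75, rock 1.0, and pop 0.25, so a genre you keep listening to gains weight even if it was never saved as a favorite.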

	## 🔒 Privacy

	- All user data is stored locally
	- No data is shared with third parties (except API calls to service providers)
	- You can clear your history at any time

	## 🤝 Contributing

	This is a hackathon submission, but contributions and feedback are welcome!

	## 📄 License

	MIT License - feel free to use and modify as needed.

	## 📧 Contact

	Built with ❤️ for the MCP 1st Birthday Competition
	Contact: @nikimakarov

	---

	Track: MCP in Action - Consumer Applications
	Tag: `mcp-in-action-track-consumer`

	🎵 Enjoy your personalized radio experience! 🎵