---
title: Transcriptinator
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
---
# 🎙️ Transcriptinator

Simple, fast audio transcription powered by Google's Gemini AI.
## Features

- **Simple & Fast** - Upload audio, get a transcript in ~20-50 seconds
- **Smart Summaries** - Automatic summary and key idea extraction
- **Private** - Your API key, your data; nothing is stored
- **Free** - Uses your own Gemini API key (free tier: 15 requests/min)
- **Markdown Output** - Clean, formatted transcripts ready to download
## How to Use

### 1. Get a Gemini API Key (Free)

1. Go to [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Click "Create API key"
3. Copy the key

### 2. Transcribe Audio

1. Upload your audio file (max 10 minutes)
   - Supported formats: MP3, WAV, M4A, OGG, FLAC, WEBM
2. Paste your API key
3. Click "Transcribe Audio"
4. Wait ~20-50 seconds
5. Download your transcript!
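
The upload constraints in step 1 (supported formats, 10-minute maximum) could be checked before anything is sent to the API. The following is a minimal sketch, not the app's actual code: the function name `check_upload` is hypothetical, and the audio duration is assumed to be known already.

```python
from pathlib import Path

# Formats and length limit as listed in the steps above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}
MAX_DURATION_SECONDS = 10 * 60


def check_upload(filename: str, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported format: {extension or '(none)'}")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("Audio file too long (maximum is 10 minutes)")
    return problems
```

The check only looks at the file extension, so a mislabeled file would still pass; it is meant as a quick first filter.
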
## What You Get

Your transcript includes:

```yaml
---
title: "Your Audio File"
date_processed: "2025-12-24"
summary: "Quick 2-3 sentence overview..."
key_ideas:
  - idea: "Main Point 1"
    description: "Explanation..."
  - idea: "Main Point 2"
    description: "Explanation..."
note_id: "unique-id"
---
## Key Ideas
- **Main Point 1:** Explanation...
- **Main Point 2:** Explanation...
## Full Transcription
[00:00] Speaker 1: Hello...
[00:15] Speaker 2: Welcome...
```
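
To make the layout above concrete, here is a minimal sketch of how such a file could be assembled once the summary, key ideas, and transcript segments are available. The function `render_transcript` and its parameters are hypothetical and only illustrate the format; strings are quoted with `json.dumps`, relying on the fact that JSON double-quoted strings are also valid YAML scalars.

```python
import json


def render_transcript(title, date_processed, summary, key_ideas, note_id, segments):
    """Assemble the Markdown document: YAML front matter, key ideas, transcription.

    key_ideas: list of (idea, description) tuples.
    segments:  list of (seconds, speaker, text) tuples.
    """
    q = json.dumps  # JSON-style double-quoted strings are also valid YAML scalars
    lines = [
        "---",
        f"title: {q(title)}",
        f"date_processed: {q(date_processed)}",
        f"summary: {q(summary)}",
        "key_ideas:",
    ]
    for idea, description in key_ideas:
        lines.append(f"  - idea: {q(idea)}")
        lines.append(f"    description: {q(description)}")
    lines += [f"note_id: {q(note_id)}", "---", "## Key Ideas"]
    for idea, description in key_ideas:
        lines.append(f"- **{idea}:** {description}")
    lines.append("## Full Transcription")
    for seconds, speaker, text in segments:
        # Timestamps are rendered as [MM:SS], matching the example above.
        minutes, secs = divmod(int(seconds), 60)
        lines.append(f"[{minutes:02d}:{secs:02d}] {speaker}: {text}")
    return "\n".join(lines) + "\n"
```
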
## Limitations

- **Maximum audio length:** 10 minutes (free Hugging Face tier timeout limit)
- **Processing time:** ~20-50 seconds depending on audio length
- **API rate limits:** 15 requests/minute (Gemini free tier)
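
If you script against the API yourself, a small client-side limiter can help you stay under the 15 requests/minute limit. This is a sketch of a sliding-window limiter; the class name `RateLimiter` and its methods are hypothetical, and it is not part of the app.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most `max_calls` calls per `period` seconds."""

    def __init__(self, max_calls=15, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = deque()  # timestamps of calls inside the current window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self):
        """Register that a call was just made."""
        self._calls.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, make the request, and then call `record()`.
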
## Privacy & Security

- ✅ **Your API key is never stored** - Used only for the current request
- ✅ **Audio files are temporary** - Deleted immediately after processing
- ✅ **No data collection** - Everything runs through your own API key
## Technical Details

**AI calls per transcription:** 3

1. Transcription (with timestamps and speakers)
2. Summary generation
3. Key ideas extraction

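
The three calls could be chained roughly as follows. This is a sketch under stated assumptions, not the app's implementation: `ask_model` is a hypothetical callable standing in for whatever wraps the Gemini API, the prompts are illustrative, and the summary and key ideas are assumed here to be generated from the transcript text rather than from the audio.

```python
from typing import Callable, Optional


def run_pipeline(audio_path: str, ask_model: Callable[[str, Optional[str]], str]) -> dict:
    """Run the three steps in order and return their outputs.

    ask_model(prompt, audio_path) is a hypothetical wrapper around the model API;
    audio_path is None for text-only calls.
    """
    # Call 1: transcription with timestamps and speaker labels.
    transcript = ask_model(
        "Transcribe this audio with [MM:SS] timestamps and speaker labels.",
        audio_path,
    )
    # Call 2: summary, assumed here to be generated from the transcript text.
    summary = ask_model(f"Summarize in 2-3 sentences:\n\n{transcript}", None)
    # Call 3: key ideas, also assumed to be generated from the transcript text.
    key_ideas = ask_model(
        f"List the key ideas with short descriptions:\n\n{transcript}", None
    )
    return {"transcript": transcript, "summary": summary, "key_ideas": key_ideas}
```

Passing the model call in as a parameter keeps the sketch independent of any particular client library and makes it easy to exercise with a stub.
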

**Processing time estimate:**

- 2-minute audio: ~22 seconds
- 5-minute audio: ~35 seconds
- 10-minute audio: ~50 seconds
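
For lengths between the listed points, a rough figure can be obtained by interpolating linearly between neighbouring estimates. The sketch below does that; the function name `estimate_seconds` is hypothetical, and the result is only as reliable as the three approximate figures it is built from.

```python
# (audio minutes, approximate processing seconds) taken from the estimates above
ESTIMATES = [(2.0, 22.0), (5.0, 35.0), (10.0, 50.0)]


def estimate_seconds(audio_minutes: float) -> float:
    """Piecewise-linear interpolation between the listed estimates.

    Values outside the 2-10 minute range are clamped to the nearest listed point.
    """
    if audio_minutes <= ESTIMATES[0][0]:
        return ESTIMATES[0][1]
    for (x0, y0), (x1, y1) in zip(ESTIMATES, ESTIMATES[1:]):
        if audio_minutes <= x1:
            return y0 + (y1 - y0) * (audio_minutes - x0) / (x1 - x0)
    return ESTIMATES[-1][1]
```

For example, `estimate_seconds(7.5)` returns 42.5, halfway between the 5-minute and 10-minute figures.
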
## Troubleshooting

**"Invalid API key"**

- Make sure you copied the entire key
- Generate a new key at [Google AI Studio](https://aistudio.google.com/app/apikey)

**"Audio file too long"**

- The maximum is 10 minutes on the free tier
- Split longer files or use the [CLI version](https://github.com/YOUR_USERNAME/transcriptinator)

**"Processing timeout"**

- The audio might be too long or corrupted
- Try a shorter, clearer audio file
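
One way to split a longer recording into parts that fit the 10-minute limit is sketched below. It uses only Python's standard `wave` module, so it handles uncompressed WAV files only; other formats would need a different tool. The function name `split_wav` and the output naming scheme are hypothetical.

```python
import wave
from pathlib import Path


def split_wav(path, out_dir, max_seconds=600):
    """Split a WAV file into consecutive parts of at most `max_seconds` each.

    Returns the list of written file paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(path), "rb") as src:
        params = src.getparams()
        frames_per_part = int(src.getframerate() * max_seconds)
        index = 0
        while True:
            # Reads one whole part into memory; fine for a sketch, heavy for huge files.
            frames = src.readframes(frames_per_part)
            if not frames:
                break
            index += 1
            part_path = out_dir / f"{Path(path).stem}_part{index:02d}.wav"
            with wave.open(str(part_path), "wb") as dst:
                dst.setparams(params)  # frame count in the header is corrected on close
                dst.writeframes(frames)
            written.append(part_path)
    return written
```

Splitting at fixed offsets can cut a sentence in half, so the transcripts of adjacent parts may need a quick manual check at the boundaries.
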
## Local Installation

Want to transcribe audio of unlimited length? Clone the full version:

```bash
git clone https://github.com/YOUR_USERNAME/transcriptinator
cd transcriptinator
pip install -r requirements.txt
python audio_process_and_transcribe.py your_audio_folder -o output_folder
```
## Credits

Built with:

- [Gradio](https://gradio.app/) - Web interface
- [Google Gemini](https://ai.google.dev/) - AI transcription
- [Hugging Face Spaces](https://huggingface.co/spaces) - Hosting

## License

MIT License - Feel free to use and modify!