---
title: Transcriptinator
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
---
# 🎙️ Transcriptinator
Simple, fast audio transcription powered by Google's Gemini AI.
## Features
- 🎯 **Simple & Fast** - Upload audio, get a transcript in ~20-50 seconds
- 📝 **Smart Summaries** - Automatic summary and key ideas extraction
- 🔒 **Private** - Your API key, your data - nothing stored
- 💰 **Free** - Uses your own Gemini API key (free tier: 15 requests/min)
- 📄 **Markdown Output** - Clean, formatted transcripts ready to download
## How to Use
### 1. Get a Gemini API Key (Free)
1. Go to [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Click "Create API key"
3. Copy the key
### 2. Transcribe Audio
1. Upload your audio file (max 10 minutes)
   - Supported formats: MP3, WAV, M4A, OGG, FLAC, WEBM
2. Paste your API key
3. Click "🚀 Transcribe Audio"
4. Wait ~20-50 seconds
5. Download your transcript!
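If you want to check a file before uploading it, the supported-format list above can be sketched as a simple extension check. This is a minimal illustration, not the app's actual validation code; `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical names.

```python
from pathlib import Path

# Extensions taken from the "Supported formats" list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    print(is_supported_audio("interview.MP3"))  # extension match is case-insensitive
    print(is_supported_audio("notes.txt"))
```

The check only looks at the file name, so a mislabeled or corrupted file would still pass it.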
## What You Get
Your transcript includes:
```yaml
---
title: "Your Audio File"
date_processed: "2025-12-24"
summary: "Quick 2-3 sentence overview..."
key_ideas:
  - idea: "Main Point 1"
    description: "Explanation..."
  - idea: "Main Point 2"
    description: "Explanation..."
note_id: "unique-id"
---
## Key Ideas
- **Main Point 1:** Explanation...
- **Main Point 2:** Explanation...
## Full Transcription
[00:00] Speaker 1: Hello...
[00:15] Speaker 2: Welcome...
```
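To make the layout above concrete, here is a rough sketch of how a note with that front matter and body could be assembled. It is not the app's real code: `render_note` and its parameters are hypothetical, and it assumes string values are written as double-quoted YAML scalars (produced here with `json.dumps`).

```python
import json
import uuid
from datetime import date


def render_note(title, summary, key_ideas, transcript_lines,
                note_id=None, processed=None):
    """Build a Markdown note with YAML front matter.

    key_ideas is a list of (idea, description) pairs;
    transcript_lines is a list of "[mm:ss] Speaker: text" strings.
    """
    note_id = note_id or str(uuid.uuid4())
    processed = processed or date.today().isoformat()
    q = json.dumps  # JSON strings are valid double-quoted YAML scalars

    lines = [
        "---",
        f"title: {q(title)}",
        f"date_processed: {q(processed)}",
        f"summary: {q(summary)}",
        "key_ideas:",
    ]
    for idea, description in key_ideas:
        lines.append(f"  - idea: {q(idea)}")
        lines.append(f"    description: {q(description)}")
    lines += [f"note_id: {q(note_id)}", "---", "## Key Ideas"]
    lines += [f"- **{idea}:** {description}" for idea, description in key_ideas]
    lines.append("## Full Transcription")
    lines += transcript_lines
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_note(
        "Your Audio File",
        "Quick 2-3 sentence overview...",
        [("Main Point 1", "Explanation...")],
        ["[00:00] Speaker 1: Hello..."],
    ))
```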
## Limitations
- **Maximum audio length:** 10 minutes (timeout limit of the free Hugging Face tier)
- **Processing time:** ~20-50 seconds depending on audio length
- **API rate limits:** 15 requests/minute (Gemini free tier)
## Privacy & Security
✅ **Your API key is never stored** - Used only for the current request
✅ **Audio files are temporary** - Deleted immediately after processing
✅ **No data collection** - Everything runs through your own API key
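One common way to get the "deleted immediately after processing" behavior is a temporary file wrapped in `try`/`finally`. The sketch below illustrates that pattern only; it is not taken from `app.py`, and `process_with_temp_copy` and `handler` are hypothetical names.

```python
import os
import tempfile


def process_with_temp_copy(audio_bytes: bytes, suffix: str, handler):
    """Write audio to a temp file, run handler(path), and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(audio_bytes)
        return handler(path)
    finally:
        # Runs whether handler succeeded or raised, so no copy is left behind.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    size = process_with_temp_copy(b"fake audio", ".wav", os.path.getsize)
    print(size)
```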
## Technical Details
**AI Calls per transcription:** 3
1. Transcription (with timestamps and speakers)
2. Summary generation
3. Key ideas extraction
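The three calls listed above could be chained roughly as follows. This is a sketch under assumptions: `ask` is a hypothetical stand-in for whatever function sends a prompt to Gemini, the prompts are invented for illustration, and feeding the transcript (rather than the audio) into calls 2 and 3 is a guess about the design.

```python
from typing import Callable, Dict


def run_pipeline(audio_path: str, ask: Callable[[str, str], str]) -> Dict[str, str]:
    """Run the three model calls in order and collect their outputs.

    ask(prompt, payload) is a hypothetical callable that returns the model's text.
    """
    # Call 1: transcription with timestamps and speakers.
    transcript = ask("Transcribe with timestamps and speaker labels.", audio_path)
    # Call 2: summary generation (assumed to work from the transcript).
    summary = ask("Summarize in 2-3 sentences.", transcript)
    # Call 3: key ideas extraction (assumed to work from the transcript).
    key_ideas = ask("List the key ideas, each with a short description.", transcript)
    return {"transcript": transcript, "summary": summary, "key_ideas": key_ideas}


if __name__ == "__main__":
    # Dummy model that echoes the prompt, just to show the call order.
    result = run_pipeline("talk.mp3", lambda prompt, payload: f"<{prompt}>")
    for name, text in result.items():
        print(name, "->", text)
```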
**Processing time estimate:**
- 2-minute audio: ~22 seconds
- 5-minute audio: ~35 seconds
- 10-minute audio: ~50 seconds
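For lengths between the three data points above, a rough estimate can be read off by linear interpolation. The sketch below assumes processing time grows piecewise-linearly between the listed points, which is an approximation and not a measured model; `estimate_seconds` is a hypothetical helper.

```python
# (audio length in minutes, processing time in seconds) from the list above.
ESTIMATES = [(2.0, 22.0), (5.0, 35.0), (10.0, 50.0)]


def estimate_seconds(minutes: float) -> float:
    """Piecewise-linear estimate of processing time for audio up to 10 minutes."""
    if not 0 < minutes <= ESTIMATES[-1][0]:
        raise ValueError("audio length must be between 0 and 10 minutes")
    # Use the first segment whose right end covers `minutes`;
    # clips shorter than 2 minutes extend the first segment backwards.
    for (x0, y0), (x1, y1) in zip(ESTIMATES, ESTIMATES[1:]):
        if minutes <= x1:
            return y0 + (y1 - y0) * (minutes - x0) / (x1 - x0)
    raise ValueError("no segment covers the requested length")


if __name__ == "__main__":
    print(estimate_seconds(7.5))  # halfway between the 5- and 10-minute points
```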
## Troubleshooting
**"Invalid API key"**
- Make sure you copied the entire key
- Generate a new key at [Google AI Studio](https://aistudio.google.com/app/apikey)
**"Audio file too long"**
- The maximum is 10 minutes on the free tier
- Split longer files or use the [CLI version](https://github.com/YOUR_USERNAME/transcriptinator)
**"Processing timeout"**
- Audio might be too long or corrupted
- Try with a shorter, clearer audio file
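For the "Audio file too long" case, splitting starts with deciding where to cut. The sketch below only computes the time ranges for chunks of at most 10 minutes; the cutting itself needs an audio tool of your choice. `chunk_boundaries` is a hypothetical helper, not part of this project.

```python
from typing import List, Tuple


def chunk_boundaries(total_seconds: float,
                     max_chunk_seconds: float = 600.0) -> List[Tuple[float, float]]:
    """Return (start, end) ranges in seconds, each at most max_chunk_seconds long."""
    if total_seconds <= 0 or max_chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_chunk_seconds, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # A 25-minute recording split into 10-minute pieces.
    for start, end in chunk_boundaries(25 * 60):
        print(f"{start:.0f}s to {end:.0f}s")
```

If each chunk is transcribed separately, its timestamps will most likely restart at `[00:00]`, so you may want to add the chunk's start offset when merging the results.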
## Local Installation
Want to transcribe audio of any length? Clone the full version:
```bash
git clone https://github.com/YOUR_USERNAME/transcriptinator
cd transcriptinator
pip install -r requirements.txt
python audio_process_and_transcribe.py your_audio_folder -o output_folder
```
## Credits
Built with:
- [Gradio](https://gradio.app/) - Web interface
- [Google Gemini](https://ai.google.dev/) - AI transcription
- [HuggingFace Spaces](https://huggingface.co/spaces) - Hosting
## License
MIT License - Feel free to use and modify!