metadata

title: Transcriptinator
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false

🎙️ Transcriptinator

Simple, fast audio transcription powered by Google's Gemini AI.

Features

🎯 Simple & Fast - Upload audio, get a transcript in ~20-50 seconds
📝 Smart Summaries - Automatic summary and key ideas extraction
🔒 Private - Your API key, your data - nothing stored
💰 Free - Uses your own Gemini API key (free tier: 15 requests/min)
📄 Markdown Output - Clean, formatted transcripts ready to download

How to Use

1. Get a Gemini API Key (Free)

Go to Google AI Studio
Click "Create API key"
Copy the key

2. Transcribe Audio

Upload your audio file (max 10 minutes)
- Supported formats: MP3, WAV, M4A, OGG, FLAC, WEBM
Paste your API key
Click "🚀 Transcribe Audio"
Wait ~20-50 seconds
Download your transcript!
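
The upload limits above (supported formats, 10-minute maximum) could be checked before any API call is made. The following is a minimal sketch, not the app's actual code: `check_upload`, `SUPPORTED_EXTENSIONS`, and `MAX_DURATION_SECONDS` are hypothetical names, and the audio duration is assumed to be known already (for example, from an audio library).

```python
from pathlib import Path

# Formats and length limit are taken from this README; the names are illustrative.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}
MAX_DURATION_SECONDS = 10 * 60


def check_upload(filename: str, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()  # compare extensions case-insensitively
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported format: {ext or '(none)'}")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("Audio file too long")
    return problems


if __name__ == "__main__":
    print(check_upload("interview.m4a", 8 * 60))
    print(check_upload("notes.txt", 30))
```

Returning a list rather than raising on the first failure lets an interface show every problem at once.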

What You Get

Your transcript includes:

---
title: "Your Audio File"
date_processed: "2025-12-24"
summary: "Quick 2-3 sentence overview..."
key_ideas:
  - idea: "Main Point 1"
    description: "Explanation..."
  - idea: "Main Point 2"
    description: "Explanation..."
note_id: "unique-id"
---

## Key Ideas
- **Main Point 1:** Explanation...
- **Main Point 2:** Explanation...

## Full Transcription
[00:00] Speaker 1: Hello...
[00:15] Speaker 2: Welcome...
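
For readers who want to produce the same layout in their own scripts, here is a minimal sketch of how a document in the format above could be assembled. `render_transcript` and its parameters are hypothetical and not taken from the app; the sketch also assumes the text fields contain no double quotes, since it does no YAML escaping.

```python
from datetime import date


def render_transcript(title, summary, key_ideas, segments, note_id, processed=None):
    """Build a markdown transcript with YAML front matter.

    key_ideas: list of (idea, description) pairs
    segments:  list of (start_seconds, speaker, text) triples
    """
    processed = processed or date.today().isoformat()
    lines = [
        "---",
        f'title: "{title}"',
        f'date_processed: "{processed}"',
        f'summary: "{summary}"',
        "key_ideas:",
    ]
    for idea, description in key_ideas:
        lines.append(f'  - idea: "{idea}"')
        lines.append(f'    description: "{description}"')
    lines += [f'note_id: "{note_id}"', "---", "", "## Key Ideas"]
    for idea, description in key_ideas:
        lines.append(f"- **{idea}:** {description}")
    lines += ["", "## Full Transcription"]
    for seconds, speaker, text in segments:
        minutes, secs = divmod(int(seconds), 60)  # format as [MM:SS]
        lines.append(f"[{minutes:02d}:{secs:02d}] {speaker}: {text}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_transcript(
        "Your Audio File",
        "Quick overview.",
        [("Main Point 1", "Explanation...")],
        [(0, "Speaker 1", "Hello..."), (15, "Speaker 2", "Welcome...")],
        "unique-id",
    ))
```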

Limitations

Maximum audio length: 10 minutes (limited by the request timeout on the free HuggingFace tier)
Processing time: ~20-50 seconds depending on audio length
API rate limits: 15 requests/minute (Gemini free tier)

Privacy & Security

✅ Your API key is never stored - Used only for the current request
✅ Audio files are temporary - Deleted immediately after processing
✅ No data collection - Everything runs through your own API key
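
The "deleted immediately after processing" behavior can be illustrated with a temporary file that is removed in a `finally` block, so cleanup happens even when processing fails. This is a sketch of the general pattern only; `process_with_temp_copy` and `handler` are hypothetical names, and the app's real implementation may differ.

```python
import os
import tempfile


def process_with_temp_copy(audio_bytes: bytes, suffix: str, handler):
    """Write audio to a temporary file, run handler(path), then always delete the file."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(audio_bytes)
        return handler(path)
    finally:
        # Runs on success and on error, so nothing is left on disk.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    size = process_with_temp_copy(b"fake audio", ".wav", os.path.getsize)
    print(size)
```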

Technical Details

AI Calls per transcription: 3

Transcription (with timestamps and speakers)
Summary generation
Key ideas extraction
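
With 3 calls per transcription and a free-tier limit of 15 requests/minute, the number of transcriptions that fit in one minute can be worked out directly. The sketch below assumes all three calls count equally against the same limit and that no call is retried; the constant and function names are illustrative.

```python
# Figures from this README; names are illustrative.
CALLS_PER_TRANSCRIPTION = 3
FREE_TIER_REQUESTS_PER_MINUTE = 15


def transcriptions_per_minute(
    requests_per_minute: int = FREE_TIER_REQUESTS_PER_MINUTE,
    calls_per_transcription: int = CALLS_PER_TRANSCRIPTION,
) -> int:
    """Whole transcriptions that fit within the per-minute request budget."""
    return requests_per_minute // calls_per_transcription


if __name__ == "__main__":
    print(transcriptions_per_minute())  # 15 // 3 = 5
```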

Processing time estimate:

2-minute audio: ~22 seconds
5-minute audio: ~35 seconds
10-minute audio: ~50 seconds
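
For lengths between the listed points, a rough estimate can be obtained by linear interpolation. This is only a sketch: the three data points come from the list above, while the interpolation itself and the clamping outside the 2-10 minute range are assumptions, not measured behavior.

```python
# (audio minutes, processing seconds) pairs from the estimates above.
ESTIMATES = [(2, 22), (5, 35), (10, 50)]


def estimate_seconds(audio_minutes: float) -> float:
    """Piecewise-linear estimate, clamped to the first and last known points."""
    if audio_minutes <= ESTIMATES[0][0]:
        return float(ESTIMATES[0][1])
    if audio_minutes >= ESTIMATES[-1][0]:
        return float(ESTIMATES[-1][1])
    for (x0, y0), (x1, y1) in zip(ESTIMATES, ESTIMATES[1:]):
        if audio_minutes <= x1:
            return y0 + (y1 - y0) * (audio_minutes - x0) / (x1 - x0)
    return float(ESTIMATES[-1][1])  # defensive fallback; not reached for valid input


if __name__ == "__main__":
    for minutes in (2, 3.5, 7.5, 10):
        print(minutes, estimate_seconds(minutes))
```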

Troubleshooting

"Invalid API key"

Make sure you copied the entire key
Generate a new key at Google AI Studio

"Audio file too long"

The maximum is 10 minutes on the free tier
Split longer files or use the CLI version
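
When splitting a longer file, the cut points can be planned before touching the audio. The sketch below only computes `(start, end)` offsets in seconds for chunks of at most 10 minutes; `chunk_boundaries` is a hypothetical helper, and the actual cutting would need an audio tool of your choice.

```python
def chunk_boundaries(total_seconds: int, max_chunk_seconds: int = 600):
    """Return (start, end) second offsets covering the audio in chunks of at most max_chunk_seconds."""
    if total_seconds <= 0 or max_chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    boundaries = []
    start = 0
    while start < total_seconds:
        end = min(start + max_chunk_seconds, total_seconds)
        boundaries.append((start, end))
        start = end
    return boundaries


if __name__ == "__main__":
    # A 25-minute recording becomes two full chunks and one shorter one.
    print(chunk_boundaries(25 * 60))
```

Cutting at fixed offsets can split a sentence in half; choosing cut points at pauses would give cleaner transcripts but is outside the scope of this sketch.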

"Processing timeout"

Audio might be too long or corrupted
Try with a shorter, clearer audio file

Local Installation

Want to transcribe audio of unlimited length? Clone the full version:

```bash
git clone https://github.com/YOUR_USERNAME/transcriptinator
cd transcriptinator
pip install -r requirements.txt
python audio_process_and_transcribe.py your_audio_folder -o output_folder
```


## Credits

Built with:
- [Gradio](https://gradio.app/) - Web interface
- [Google Gemini](https://ai.google.dev/) - AI transcription
- [HuggingFace Spaces](https://huggingface.co/spaces) - Hosting

## License

MIT License - Feel free to use and modify!