---
title: Text Summarization
emoji: ๐
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 8501
pinned: false
license: mit
short_description: Summarize YouTube, web pages, and uploaded docs.
---

# Text Summarization

This Space runs a Streamlit app that summarizes:

- YouTube videos
- website URLs
- uploaded PDF, TXT, MD, CSV, and DOCX files

## Required Secret

Add this secret in the Space settings:

- `GROQ_API_KEY`

## YouTube On Hugging Face Spaces

YouTube transcript loading may work locally but fail on Hugging Face Spaces, because YouTube frequently blocks or rate-limits datacenter IP ranges. The app retries transient HTTPS failures and supports proxy configuration through these Space secrets:

- `YOUTUBE_HTTP_PROXY`
- `YOUTUBE_HTTPS_PROXY`

You can also use the standard `HTTP_PROXY` and `HTTPS_PROXY` environment variables if that matches your setup.
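
The sketch below shows one way these variables could be resolved into a `requests`-style proxies mapping. The function name `resolve_youtube_proxies` is hypothetical, and the precedence (YouTube-specific secrets first, standard variables second) is an assumption rather than the app's confirmed behavior.

```python
import os


def resolve_youtube_proxies(env=None):
    """Return a requests-style proxies dict, or None when no proxy is set.

    Hypothetical helper: the precedence order here is an assumption.
    """
    env = os.environ if env is None else env
    # Assumption: YouTube-specific secrets win over the standard variables.
    http_proxy = env.get("YOUTUBE_HTTP_PROXY") or env.get("HTTP_PROXY")
    https_proxy = env.get("YOUTUBE_HTTPS_PROXY") or env.get("HTTPS_PROXY")

    proxies = {}
    if http_proxy:
        proxies["http"] = http_proxy
    if https_proxy:
        proxies["https"] = https_proxy
    return proxies or None
```

The returned mapping has the shape that HTTP clients such as `requests` accept through their `proxies` argument. A return value of `None` means the request goes out directly.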

## Space-Only YouTube Fallbacks

The Hugging Face Space version supports multiple YouTube retrieval strategies:

- Direct transcript fetch
- External transcript API
- Audio transcription via `yt-dlp` + Groq Whisper
- Manual transcript paste/upload
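
A minimal sketch of how such a fallback chain might be ordered follows. It assumes the strategies are tried in the order listed and that the first non-empty transcript wins. The function `fetch_transcript_with_fallbacks` and the `(name, callable)` strategy shape are hypothetical, not the app's actual interface.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical shape: a label plus a callable that takes a video ID and
# returns transcript text (or None / empty string when nothing was found).
Strategy = Tuple[str, Callable[[str], Optional[str]]]


def fetch_transcript_with_fallbacks(
    video_id: str, strategies: Iterable[Strategy]
) -> Tuple[str, str]:
    """Try each strategy in order; return (strategy_name, transcript)."""
    errors = []
    for name, fetch in strategies:
        try:
            text = fetch(video_id)
        except Exception as exc:  # any failure moves on to the next strategy
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty transcript")
    raise RuntimeError("All strategies failed: " + "; ".join(errors))
```

Collecting the per-strategy errors makes it easier to show the user why automatic retrieval failed before offering the manual paste/upload option.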

### Optional secrets for external transcript API

- `YOUTUBE_TRANSCRIPT_API_URL`
- `YOUTUBE_TRANSCRIPT_API_KEY`
- `YOUTUBE_TRANSCRIPT_API_METHOD` (`GET` or `POST`, default `GET`)
- `YOUTUBE_TRANSCRIPT_API_KEY_HEADER` (default `Authorization`)
- `YOUTUBE_TRANSCRIPT_API_TIMEOUT` (default `45`)

`YOUTUBE_TRANSCRIPT_API_URL` may contain placeholders such as `{video_id}`, `{url}`, and `{language_code}`.
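
To illustrate how these settings could combine into a request, here is a hedged sketch. The helper `build_transcript_request` is hypothetical, and several details are assumptions: that `{url}` means the video's watch URL, that placeholder values are percent-encoded, that the timeout is in seconds, and that the key is sent as the raw header value with no prefix.

```python
import os
from urllib.parse import quote


def build_transcript_request(video_id, language_code="en", env=None):
    """Build request parameters from the optional secrets (hypothetical helper)."""
    env = os.environ if env is None else env
    template = env["YOUTUBE_TRANSCRIPT_API_URL"]

    # Assumption: {url} refers to the standard watch URL of the video.
    watch_url = f"https://www.youtube.com/watch?v={video_id}"
    values = {
        "{video_id}": video_id,
        "{url}": watch_url,
        "{language_code}": language_code,
    }
    url = template
    for placeholder, value in values.items():
        # Plain replacement (not str.format) so other braces in the URL are safe.
        url = url.replace(placeholder, quote(value, safe=""))

    headers = {}
    api_key = env.get("YOUTUBE_TRANSCRIPT_API_KEY")
    if api_key:
        header_name = env.get("YOUTUBE_TRANSCRIPT_API_KEY_HEADER", "Authorization")
        # Assumption: the key is sent as-is, without a "Bearer " prefix.
        headers[header_name] = api_key

    return {
        "method": env.get("YOUTUBE_TRANSCRIPT_API_METHOD", "GET").upper(),
        "url": url,
        "headers": headers,
        # Assumption: the timeout value is expressed in seconds.
        "timeout": float(env.get("YOUTUBE_TRANSCRIPT_API_TIMEOUT", "45")),
    }
```

The resulting dictionary can be unpacked into an HTTP client call, for example `requests.request(**params)`.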

### Optional secrets for Groq audio transcription fallback

- `GROQ_AUDIO_TRANSCRIPTION_MODEL`

Default model: `whisper-large-v3-turbo`