---
title: Text Summarization
emoji: 📝
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 8501
pinned: false
license: mit
short_description: Summarize YouTube, web pages, and uploaded docs.
---
# Text Summarization
This Space runs a Streamlit app for summarizing:
- YouTube videos
- website URLs
- uploaded PDF, TXT, MD, CSV, and DOCX files
## Required Secret
Add this secret in the Space settings:
- `GROQ_API_KEY`
## YouTube On Hugging Face Spaces
YouTube transcript loading may work locally but fail on Hugging Face Spaces because YouTube frequently blocks or rate-limits datacenter IP ranges. The app now retries transient HTTPS failures and supports proxy configuration through Space secrets:
- `YOUTUBE_HTTP_PROXY`
- `YOUTUBE_HTTPS_PROXY`
You can also use the standard `HTTP_PROXY` and `HTTPS_PROXY` environment variables if that matches your setup.
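The lookup order and retry behavior described above can be sketched as follows. This is a minimal illustration and not the app's actual code: the function names `resolve_youtube_proxies` and `fetch_with_retries`, the precedence of the `YOUTUBE_*` secrets over the standard variables, the set of status codes treated as transient, and the backoff schedule are all assumptions.

```python
import os
import time
import urllib.error

# Assumption: these HTTP status codes are the ones worth retrying.
TRANSIENT_STATUS_CODES = (429, 500, 502, 503, 504)


def resolve_youtube_proxies(env=None):
    """Build a {"http": ..., "https": ...} mapping from environment variables.

    Assumption: the YOUTUBE_* secrets win over HTTP_PROXY / HTTPS_PROXY.
    """
    env = os.environ if env is None else env
    proxies = {}
    http_proxy = env.get("YOUTUBE_HTTP_PROXY") or env.get("HTTP_PROXY")
    https_proxy = env.get("YOUTUBE_HTTPS_PROXY") or env.get("HTTPS_PROXY")
    if http_proxy:
        proxies["http"] = http_proxy
    if https_proxy:
        proxies["https"] = https_proxy
    return proxies


def is_transient(exc):
    """Decide whether an error is worth retrying (hypothetical policy)."""
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code in TRANSIENT_STATUS_CODES
    return isinstance(exc, (urllib.error.URLError, TimeoutError))


def fetch_with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times (>= 1) with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch()
        except (urllib.error.URLError, TimeoutError) as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_transient(exc):
                raise
            sleep(base_delay * 2 ** attempt)
```

The resulting mapping has the shape accepted by `urllib.request.ProxyHandler` and by the `proxies` argument of `requests`, so it can be handed to whichever HTTP client the fetch callable uses.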
## Space-Only YouTube Fallbacks
The Hugging Face Space version now supports multiple YouTube retrieval strategies:
- Direct transcript fetch
- External transcript API
- Audio transcription via `yt-dlp` + Groq Whisper
- Manual transcript paste/upload
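One way to combine several strategies is an ordered fallback chain. The sketch below assumes the strategies are tried one after another until one returns text; the README does not state whether the app does this automatically or lets the user pick. The helper name `first_transcript` and the `(name, fetch)` pair shape are hypothetical.

```python
def first_transcript(video_url, strategies):
    """Try each (name, fetch) pair in order and return the first usable result.

    `strategies` is a sequence of (name, callable) pairs; each callable takes
    the video URL and returns transcript text, or raises on failure.
    """
    errors = []
    for name, fetch in strategies:
        try:
            text = fetch(video_url)
        except Exception as exc:  # one failed strategy should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty transcript")
    raise RuntimeError("All YouTube strategies failed: " + "; ".join(errors))
```

Returning the strategy name alongside the text lets the UI report which path produced the transcript, and the collected error messages explain why the earlier ones were skipped.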
### Optional secrets for external transcript API
- `YOUTUBE_TRANSCRIPT_API_URL`
- `YOUTUBE_TRANSCRIPT_API_KEY`
- `YOUTUBE_TRANSCRIPT_API_METHOD` (`GET` or `POST`, default `GET`)
- `YOUTUBE_TRANSCRIPT_API_KEY_HEADER` (default `Authorization`)
- `YOUTUBE_TRANSCRIPT_API_TIMEOUT` (default `45`)
`YOUTUBE_TRANSCRIPT_API_URL` may contain placeholders such as `{video_id}`, `{url}`, and `{language_code}`.
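As an illustration of how these settings could fit together, the sketch below expands the placeholders and builds a request object from the secrets. It is not the app's implementation: the function name `build_transcript_request`, URL-encoding of placeholder values, sending the key verbatim (no `Bearer` prefix), and reading the timeout as seconds are all assumptions. The README does not specify a `POST` body, so none is attached here.

```python
import os
import urllib.parse
import urllib.request


def build_transcript_request(video_id, url, language_code="en", env=None):
    """Return (urllib.request.Request, timeout) for the external transcript API."""
    env = os.environ if env is None else env
    endpoint = env["YOUTUBE_TRANSCRIPT_API_URL"]

    # Replace only the documented placeholders; anything else is left untouched.
    values = {"video_id": video_id, "url": url, "language_code": language_code}
    for name, value in values.items():
        encoded = urllib.parse.quote(value, safe="")  # assumption: values are URL-encoded
        endpoint = endpoint.replace("{" + name + "}", encoded)

    method = (env.get("YOUTUBE_TRANSCRIPT_API_METHOD") or "GET").upper()
    if method not in ("GET", "POST"):
        raise ValueError(f"Unsupported method: {method}")

    headers = {}
    api_key = env.get("YOUTUBE_TRANSCRIPT_API_KEY")
    if api_key:
        header_name = env.get("YOUTUBE_TRANSCRIPT_API_KEY_HEADER") or "Authorization"
        headers[header_name] = api_key  # assumption: key is sent as-is

    timeout = float(env.get("YOUTUBE_TRANSCRIPT_API_TIMEOUT") or "45")
    request = urllib.request.Request(endpoint, headers=headers, method=method)
    return request, timeout
```

The returned pair can be passed to `urllib.request.urlopen(request, timeout=timeout)`.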
### Optional secrets for Groq audio transcription fallback
- `GROQ_AUDIO_TRANSCRIPTION_MODEL` (default `whisper-large-v3-turbo`)
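A minimal sketch of how the model name might be resolved, assuming an unset or blank secret falls back to the default. The helper name `resolve_transcription_model` is hypothetical.

```python
import os

DEFAULT_TRANSCRIPTION_MODEL = "whisper-large-v3-turbo"


def resolve_transcription_model(env=None):
    """Return the configured Groq transcription model, or the default."""
    env = os.environ if env is None else env
    configured = (env.get("GROQ_AUDIO_TRANSCRIPTION_MODEL") or "").strip()
    return configured or DEFAULT_TRANSCRIPTION_MODEL
```

The resolved name would then be supplied as the `model` argument of the Groq audio transcription call.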