## Description

The `YouTube Transcript Fetcher` is a command-line utility that extracts transcripts from YouTube videos through YouTube's internal caption delivery (timedtext) endpoint, accessed via the `youtube-transcript-api` library. Because it avoids HTML scraping and headless browser automation, it retrieves caption data with lightweight HTTP requests rather than a full page render, which keeps retrieval fast. The tool supports four output formats (Text, JSON, SRT, WebVTT), handles batch processing, and honors a language priority list with automatic fallback.
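
The language priority with fallback described above can be illustrated with a small, self-contained sketch. The function name `pick_language` and the final "first available" fallback are assumptions made for illustration; they are not taken from `main.py`.

```python
from typing import Optional, Sequence


def pick_language(available: Sequence[str], preferred: Sequence[str]) -> Optional[str]:
    """Return the first preferred language code that is available.

    Hypothetical helper: falls back to the first available code when none
    of the preferred codes match, and returns None when the video offers
    no transcripts at all.
    """
    available_set = set(available)
    for code in preferred:
        if code in available_set:
            return code
    # Assumed fallback behavior: take whatever the video offers first.
    return available[0] if available else None


if __name__ == "__main__":
    print(pick_language(["de", "en", "fr"], ["es", "en"]))  # en
    print(pick_language(["ja"], ["es", "en"]))              # ja
```

In `youtube-transcript-api`, the preference order is normally passed as a `languages` list when fetching, so a real implementation would likely delegate most of this selection to the library.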
## System Overview

```mermaid
graph TD
    A[User Commands] --> B[main.py CLI Handler]
    B --> C[YouTubeTranscriptApi Instance]
    C --> D[YouTube timedtext Endpoint]
    D -- XML/JSON Data --> C
    C -- List of Snippets --> B
    B --> E{Output Mode}
    E -->|Write to File| F[Exported Transcript]
    E -->|Terminal| G[Standard Output]
```
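
The output-mode branch at the end of the diagram can be sketched as follows. This is a minimal illustration, not the code in `main.py`: the `-o/--output` flag, the function names, and the stubbed `fetch` callable (standing in for the transcript API call) are all hypothetical.

```python
import argparse
import sys
from pathlib import Path
from typing import Callable, List, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Flag names here are assumptions made for illustration only.
    parser = argparse.ArgumentParser(description="Fetch a YouTube transcript.")
    parser.add_argument("video", help="Video URL or ID")
    parser.add_argument("-o", "--output", help="Write to this file instead of the terminal")
    return parser


def emit(text: str, output: Optional[str]) -> None:
    """Write to a file when a path is given, otherwise to standard output."""
    if output:
        Path(output).write_text(text, encoding="utf-8")
    else:
        sys.stdout.write(text + "\n")


def main(
    argv: Optional[Sequence[str]] = None,
    # Hypothetical stand-in for the API call that returns snippet texts.
    fetch: Callable[[str], List[str]] = lambda video: ["(transcript lines would appear here)"],
) -> None:
    args = build_parser().parse_args(argv)
    lines = fetch(args.video)
    emit("\n".join(lines), args.output)
```

Keeping the fetch step injectable makes the output branch easy to exercise without network access.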
## Project Structure

```text
youtube-transcript-fetcher/
├── .gitignore
├── GUIDE.md
├── LICENSE
├── main.py
├── README.md
├── STACKS.md
└── requirements.txt
```
## Tech Stack

Audit of project files (excluding environment and cache):

| File Type | Count | Size (KB) |
| :--- | :--- | :--- |
| Python (.py) | 1 | 9.9 |
| Markdown (.md) | 2 | 8.6 |
| Text (.txt) | 1 | 0.1 |
| Gitignore (.gitignore) | 1 | 0.1 |
| License | 1 | 1.1 |

**Total Files**: 6
## Dependencies

- **Python**:
  - `youtube-transcript-api`: Core caption data retrieval and formatting.
  - `argparse` (standard library): Command-line interface definition and parsing.
  - `requests`: Underlying HTTP request handling (used indirectly, via the API library).
  - `re` (standard library): URL parsing and video ID extraction.
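
As an illustration of the `re`-based video ID extraction listed above, here is a minimal sketch. The patterns and the name `extract_video_id` are assumptions, not the ones used in `main.py`, and they rely on the current convention that video IDs are 11 characters long.

```python
import re
from typing import Optional

# Assumption: video IDs are 11 characters drawn from letters, digits, "-" and "_".
_VIDEO_ID = r"[A-Za-z0-9_-]{11}"
_URL_PATTERN = re.compile(
    r"(?:youtube\.com/(?:watch\?(?:.*&)?v=|embed/|shorts/)|youtu\.be/)"
    rf"({_VIDEO_ID})"
)
_BARE_ID = re.compile(rf"^{_VIDEO_ID}$")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a URL or bare ID, or None if none is found."""
    value = value.strip()
    if _BARE_ID.match(value):
        return value
    match = _URL_PATTERN.search(value)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
    print(extract_video_id("https://youtu.be/dQw4w9WgXcQ?t=10"))            # dQw4w9WgXcQ
```

Accepting a bare ID as well as a URL lets the same helper serve both single-video and batch inputs.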