## Description
The `YouTube Transcript Fetcher` is a command-line utility that extracts transcripts from YouTube videos by querying YouTube's caption (timedtext) endpoint through the `youtube-transcript-api` library. Because it requests caption data directly instead of scraping HTML or driving a headless browser, retrieval is fast and needs no browser tooling. The tool supports multiple output formats (Text, JSON, SRT, WebVTT), processes several videos in one batch, and accepts a prioritized list of languages, falling back to the next language when a preferred one is unavailable.
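As an illustration of the SRT output format mentioned above, the following is a minimal sketch of how caption snippets could be rendered as SRT. The `Snippet` class and the `srt_timestamp` and `to_srt` helpers are hypothetical names introduced here; they are not taken from `main.py`, and the real tool may delegate formatting to the library instead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snippet:
    """Hypothetical stand-in for one caption entry (times in seconds)."""
    text: str
    start: float
    duration: float


def srt_timestamp(seconds: float) -> str:
    """Render seconds in the HH:MM:SS,mmm form used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(snippets: List[Snippet]) -> str:
    """Join snippets into numbered SRT cues separated by blank lines."""
    blocks = []
    for index, snippet in enumerate(snippets, start=1):
        start = srt_timestamp(snippet.start)
        end = srt_timestamp(snippet.start + snippet.duration)
        blocks.append(f"{index}\n{start} --> {end}\n{snippet.text}")
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

Each cue's end time is derived as `start + duration`, which assumes the caption data carries a start offset and a duration per snippet.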

## System Overview

```mermaid
graph TD
    A[User Commands] --> B[main.py CLI Handler]
    B --> C[YouTubeTranscriptApi Instance]
    C --> D[YouTube timedtext Endpoint]
    D -- XML/JSON Data --> C
    C -- List of Snippets --> B
    B --> E{Output Mode}
    E -->|Write to File| F[Exported Transcript]
    E -->|Terminal| G[Standard Output]
```
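The final branch of the diagram (Output Mode) could be sketched roughly as follows. This is an assumption about how the choice between file and terminal might be made, not the actual logic of `main.py`; the `emit` function and its parameters are hypothetical.

```python
import sys
from pathlib import Path
from typing import Optional, TextIO


def emit(transcript: str,
         output_path: Optional[str] = None,
         stream: TextIO = sys.stdout) -> str:
    """Write the transcript to a file if a path is given, else to the stream.

    Returns which branch was taken ("file" or "terminal").
    """
    if output_path:
        # "Write to File" branch: export the formatted transcript as UTF-8.
        Path(output_path).write_text(transcript, encoding="utf-8")
        return "file"
    # "Terminal" branch: print to standard output (or any text stream).
    stream.write(transcript)
    return "terminal"
```

Passing the stream as a parameter keeps the terminal branch easy to test, since an in-memory buffer can stand in for standard output.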

## Project Structure

```text
youtube-transcript-fetcher/
├── .gitignore
├── GUIDE.md
├── LICENSE
├── main.py
├── README.md
├── STACKS.md
└── requirements.txt
```

## Techstack
Audit of project files (excluding environment and cache):

| File Type | Count | Size (KB) |
| :--- | :--- | :--- |
| Python (.py) | 1 | 9.9 |
| Markdown (.md) | 2 | 8.6 |
| Text (.txt) | 1 | 0.1 |
| Gitignore (.gitignore) | 1 | 0.1 |
| License | 1 | 1.1 |

**Total Files**: 6

## Dependencies
- **Python**:
  - `youtube-transcript-api`: Core caption data retrieval and formatting.
  - `argparse` (standard library): Command-line interface definition and parsing.
  - `requests`: Underlying HTTP request handling (installed as a dependency of `youtube-transcript-api`).
  - `re` (standard library): URL parsing and video ID extraction.
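
To make the roles of `re` and `argparse` concrete, here is a minimal sketch of video ID extraction and argument parsing. The regular expressions, the `extract_video_id` and `build_parser` names, and the `--format` and `--languages` flags are all assumptions for illustration; the actual patterns and options in `main.py` may differ. The sketch also assumes that video IDs are 11 characters drawn from letters, digits, `_`, and `-`.

```python
import argparse
import re
from typing import Optional

# Assumption: IDs are 11 characters from [A-Za-z0-9_-].
BARE_ID = re.compile(r"[A-Za-z0-9_-]{11}")
URL_ID = re.compile(r"(?:v=|youtu\.be/|/embed/|/shorts/)([A-Za-z0-9_-]{11})")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from a URL or bare ID, or None if none is found."""
    candidate = url_or_id.strip()
    if BARE_ID.fullmatch(candidate):
        return candidate
    match = URL_ID.search(candidate)
    return match.group(1) if match else None


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI; flag names are illustrative only."""
    parser = argparse.ArgumentParser(
        description="Fetch YouTube transcripts (illustrative sketch)."
    )
    parser.add_argument("videos", nargs="+", help="video URLs or IDs")
    parser.add_argument("--format", default="text",
                        choices=["text", "json", "srt", "vtt"])
    parser.add_argument("--languages", nargs="+", default=["en"],
                        help="language codes in priority order")
    return parser
```

Accepting several positional `videos` is one simple way to support batch processing, and the ordered `--languages` list mirrors the language priority described earlier.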