---
title: Video Transcript Chatbot
emoji: 🎥
colorFrom: yellow
colorTo: indigo
sdk: gradio
python_version: '3.10'
---

# Video Transcript Chatbot

A beginner-friendly Gradio app that turns any YouTube video into a conversational chatbot using LangChain and Hugging Face Inference API.

---

## Features

- **Dynamic Video Input**: Paste a full YouTube URL or raw video ID.  
- **Embedding Model Selection**: Pick any HF embedding model (default: `sentence-transformers/all-MiniLM-L6-v2`).  
- **LLM Model Selection**: Choose any HF text-generation model (default: `meta-llama/Llama-3.1-8B-Instruct`).  
- **Secure Token Entry**: You enter your own HF API token at runtime; no token is hard-coded.  
- **Conversational Memory**: Multi-turn chat history is preserved.  
- **Retrieval-Augmented Generation**: Uses FAISS + transcript context to ground answers.
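The "Dynamic Video Input" feature accepts either a full YouTube URL or a bare 11-character video ID. A minimal sketch of how such input might be normalized (the helper name `extract_video_id` is illustrative, not necessarily the app's actual code):

```python
import re

def extract_video_id(url_or_id: str) -> str:
    """Return the 11-character YouTube video ID from a full URL or a raw ID.

    Hypothetical helper for illustration; app.py's actual parsing may differ.
    """
    text = url_or_id.strip()
    # Common URL shapes: watch?v=ID, youtu.be/ID, /embed/ID
    match = re.search(r"(?:v=|youtu\.be/|/embed/)([A-Za-z0-9_-]{11})", text)
    if match:
        return match.group(1)
    # Otherwise assume the input is already a raw video ID.
    return text

print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
print(extract_video_id("dQw4w9WgXcQ"))                                  # dQw4w9WgXcQ
```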

---

## Prerequisites

- **Python 3.8+**  
- **Hugging Face API Token** with Inference access:  
  https://huggingface.co/settings/tokens  
- **Git** (for cloning the repo)

---

## Installation

1. **Clone the repo**  
   ```bash
   git clone https://github.com/<your-username>/yt-rag-chatbot.git
   cd yt-rag-chatbot
   ```
2. **(Optional) Create a virtual environment**
   ```bash
   python -m venv venv
   source venv/bin/activate    # macOS/Linux
   venv\Scripts\activate       # Windows
   ```
3. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

---

## Usage

1. **Start the app:**
   ```bash
   python app.py
   ```
2. **Open** your browser at the local URL (e.g. http://127.0.0.1:7860).
3. **Use the UI:**
   - **YouTube Video URL or ID:** Paste your link/ID.
   - **Embedding Model:** Leave the default or enter another HF embedding model.
   - **LLM Model:** Enter your desired HF LLM repo.
   - **Your HF API Token:** Paste your token (input is hidden).
   - Click **Initialize Chat** to load and index the transcript.
   - Ask questions in the chat window to interact with the video content.
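Because chat history is preserved, each question is answered with both the retrieved transcript context and the earlier turns in view. A minimal sketch of how prior turns can be folded into a single prompt (the `build_prompt` helper is hypothetical; the app may instead rely on a LangChain memory class):

```python
def build_prompt(context, history, question):
    """Fold retrieved transcript context and prior (user, assistant) turns
    into one LLM prompt. Hypothetical helper for illustration only."""
    turns = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in history)
    return (
        "Answer using only the transcript excerpts below.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{turns}\n\n"
        f"User: {question}\nAssistant:"
    )

prompt = build_prompt(
    context="...the speaker explains vector databases...",
    history=[("What is the video about?", "It covers vector databases.")],
    question="Which database does it recommend?",
)
print(prompt)
```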

---

## Customization

- **Default Models:** Edit the default values for `embedding_model_input` and `llm_model_input` in `app.py`.

- **Retrieval Size:** Change the `k` value in the retriever configuration:
  ```python
  retriever = vector_store.as_retriever(search_kwargs={'k': 4})
  ```
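
A larger `k` passes more transcript chunks to the LLM (more context, but also more noise). Conceptually, the retriever returns the `k` chunks whose embeddings are most similar to the question embedding; a minimal sketch of that idea without FAISS, using toy 2-D vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, chunk_vecs, k=4):
    """Indices of the k chunk vectors most similar to the query vector."""
    order = sorted(range(len(chunk_vecs)),
                   key=lambda i: cosine(query_vec, chunk_vecs[i]),
                   reverse=True)
    return order[:k]

# Toy 2-D embeddings standing in for four transcript chunks.
chunks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
print(top_k([1.0, 0.0], chunks, k=2))  # → [0, 1]
```

FAISS does the same nearest-neighbor lookup, just with optimized index structures instead of a linear scan.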