akshil-jain's picture
Update README.md
df453a5 verified

A newer version of the Gradio SDK is available: 6.2.0

Upgrade
metadata
title: Video Transcript Chatbot
emoji: 🎥
colorFrom: yellow
colorTo: indigo
sdk: gradio
python_version: '3.10'

Video Transcript Chatbot

A beginner-friendly Gradio app that turns any YouTube video into a conversational chatbot using LangChain and Hugging Face Inference API.


Features

  • Dynamic Video Input: Paste a full YouTube URL or raw video ID.
  • Embedding Model Selection: Pick any HF embedding model (default: sentence-transformers/all-MiniLM-L6-v2).
  • LLM Model Selection: Choose any HF text-generation model (default: meta-llama/Llama-3.1-8B-Instruct).
  • Secure Token Entry: You must enter your own HF API token at runtime—no hard-coded defaults.
  • Conversational Memory: Multi-turn chat history is preserved.
  • Retrieval-Augmented Generation: Uses FAISS + transcript context to ground answers.

Prerequisites


Installation

  1. Clone the repo
    git clone https://github.com/<your-username>/yt-rag-chatbot.git
    cd yt-rag-chatbot
    
  2. (Optional) Create a virtual environment
    python -m venv venv
    source venv/bin/activate    # macOS/Linux
    venv\Scripts\activate       # Windows
    
  3. Install dependencies
    python -m venv venv
    source venv/bin/activate    # macOS/Linux
    venv\Scripts\activate       # Windows
    

Usage

  1. Start the app:
    python app.py
    
  2. Open your browser at the local URL (e.g. http://127.0.0.1:7860)
  3. Use the UI:
  • YouTube Video URL or ID: Paste your link/ID.

  • Embedding Model: Leave default or enter another HF embedding model.

  • LLM Model: Enter your desired HF LLM repo.

  • Your HF API Token: Paste your token (input hidden).

  • Click Initialize Chat to load and index the transcript.

  • Ask questions in the chat window to interact with the video content.

Customization

  • Default Models: Edit the default values for embedding_model_input and llm_model_input in app.py.

  • Retrieval Size: Change the k value in the retriever configuration:

    retriever = vector_store.as_retriever(search_kwargs={'k': 4})