Spaces: akshil-jain/Video-Transcript-Chatbot

akshil-jain committed on Jun 19, 2025

Commit 97c2b63 (verified) · 1 Parent(s): 6df4a14

Upload 3 files

Files changed (3)

README.md +74 -14
app.py +111 -0
requirements.txt +8 -0

README.md CHANGED

@@ -1,14 +1,74 @@
----
-title: Video Transcript Chatbot
-emoji: ⚡
-colorFrom: pink
-colorTo: yellow
-sdk: gradio
-sdk_version: 5.34.1
-app_file: app.py
-pinned: false
-license: mit
-short_description: RAG powered  app that turns any YouTube video into a chatbot
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Video Transcript Chatbot
+A beginner-friendly Gradio app that turns any YouTube video into a conversational chatbot using LangChain and Hugging Face Inference API.
+---
+## Features
+- **Dynamic Video Input**: Paste a full YouTube URL or raw video ID.
+- **Embedding Model Selection**: Pick any HF embedding model (default: `sentence-transformers/all-MiniLM-L6-v2`).
+- **LLM Model Selection**: Choose any HF text-generation model (default: `meta-llama/Llama-3.1-8B-Instruct`).
+- **Secure Token Entry**: You must enter your own HF API token at runtime—no hard-coded defaults.
+- **Conversational Memory**: Multi-turn chat history is preserved.
+- **Retrieval-Augmented Generation**: Uses FAISS + transcript context to ground answers.
+---
+## Prerequisites
+- **Python 3.8+**
+- **Hugging Face API Token** with Inference access:
+  https://huggingface.co/settings/tokens
+- **Git** (for cloning the repo)
+---
+## Installation
+1. **Clone the repo**
+   ```bash
+   git clone https://github.com/<your-username>/yt-rag-chatbot.git
+   cd yt-rag-chatbot
+   ```
+2. **(Optional) Create a virtual environment**
+   ```bash
+   python -m venv venv
+   source venv/bin/activate    # macOS/Linux
+   venv\Scripts\activate       # Windows
+   ```
+3. **Install dependencies**
+   ```bash
+   pip install -r requirements.txt
+   ```
+---
+## Usage
+1. **Start the app:**
+   ```bash
+   python app.py
+   ```
+2. **Open** your browser at the local URL (e.g. http://127.0.0.1:7860)
+3. **Use the UI:**
+   - **YouTube Video URL or ID:** Paste your link/ID.
+   - **Embedding Model:** Leave the default or enter another HF embedding model.
+   - **LLM Model:** Enter your desired HF LLM repo.
+   - **Your HF API Token:** Paste your token (input hidden).
+   - Click **Initialize Chat** to load and index the transcript.
+   - Ask questions in the chat window to interact with the video content.
+---
+## Customization
+- **Default Models:** Edit the default values for `embedding_model_input` and `llm_model_input` in `app.py`.
+- **Retrieval Size:** Change the `k` value in the retriever configuration:
+  ```python
+  retriever = vector_store.as_retriever(search_kwargs={'k': 4})
+  ```
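
Note on the retrieval size: the `k` value above controls how many transcript chunks are handed to the LLM for each question. The following is a minimal, library-free sketch of top-k selection, not the app's actual retrieval path. The helper names `cosine` and `top_k` and the toy 2-D vectors are hypothetical, and cosine similarity is used only for illustration; the FAISS store in `app.py` may rank by a different distance metric.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float],
          chunks: List[Tuple[str, Sequence[float]]],
          k: int = 4) -> List[str]:
    """Return the texts of the k chunks whose vectors are most similar to the query."""
    ranked = sorted(chunks, key=lambda item: cosine(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy (text, embedding) pairs; real embeddings have hundreds of dimensions.
chunks = [
    ("intro", [1.0, 0.0]),
    ("setup", [0.8, 0.6]),
    ("outro", [0.0, 1.0]),
]
query = [1.0, 0.1]

print(top_k(query, chunks, k=2))
```

A larger `k` gives the model more context at the cost of a longer prompt; a smaller `k` keeps the prompt short but may miss the relevant passage.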

app.py ADDED

	@@ -0,0 +1,111 @@

+import os
+import re
+import gradio as gr
+from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled
+from langchain.text_splitter import RecursiveCharacterTextSplitter
+from langchain_huggingface import HuggingFaceEndpointEmbeddings, HuggingFaceEndpoint, ChatHuggingFace
+from langchain_community.vectorstores import FAISS
+from langchain.prompts import PromptTemplate
+from langchain.memory import ConversationBufferMemory
+from langchain.chains import ConversationalRetrievalChain
+# No default token: user must supply their Hugging Face API token via the UI
+def extract_video_id(url_or_id: str) -> str:
+    pattern = r"(?:v=|\/)([0-9A-Za-z_-]{11})"
+    match = re.search(pattern, url_or_id)
+    return match.group(1) if match else url_or_id
+# Load, embed, and index the transcript
+def load_vector_store(video_id: str, huggingface_token: str, embedding_model: str):
+    # Export the token for the embedding calls (it stays in the process environment afterwards)
+    os.environ['HUGGINGFACEHUB_API_TOKEN'] = huggingface_token.strip()
+    try:
+        transcript_list = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'])
+        transcript = ' '.join(chunk['text'] for chunk in transcript_list)
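+    except TranscriptsDisabled:
+        # An empty transcript yields no documents to index, so report the problem in the UI
+        raise gr.Error('Transcripts are disabled for this video.')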
+    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
+    docs = splitter.create_documents([transcript])
+    embeddings = HuggingFaceEndpointEmbeddings(
+        model=embedding_model,
+        huggingfacehub_api_token=os.environ['HUGGINGFACEHUB_API_TOKEN']
+    )
+    return FAISS.from_documents(docs, embeddings)
+# Initialize/reinitialize the QA chain
+def setup(video_input, embedding_model, llm_model, huggingface_token):
+    video_id = extract_video_id(video_input)
+    vector_store = load_vector_store(video_id, huggingface_token, embedding_model)
+    retriever = vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 4})
+    prompt_template = '''
+You are a helpful assistant.
+Answer ONLY from the provided transcript context.
+If the context is insufficient, say you don't know.
+{context}
+Question: {question}
+'''
+    prompt = PromptTemplate(template=prompt_template, input_variables=['context', 'question'])
+    memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)
+    # Configure the LLM endpoint
+    os.environ['HUGGINGFACEHUB_API_TOKEN'] = huggingface_token.strip()
+    hf_llm = HuggingFaceEndpoint(
+        repo_id=llm_model,
+        task='text-generation',
+        max_new_tokens=512,
+        temperature=0.2,
+        huggingfacehub_api_token=os.environ['HUGGINGFACEHUB_API_TOKEN']
+    )
+    chat_model = ChatHuggingFace(llm=hf_llm, verbose=True)
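+    qa_chain = ConversationalRetrievalChain.from_llm(
+        llm=chat_model,
+        retriever=retriever,
+        memory=memory,
+        chain_type='stuff',
+        # Pass the transcript-only prompt to the answer step; otherwise it is never used
+        combine_docs_chain_kwargs={'prompt': prompt},
+        return_source_documents=False
+    )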
+    # Reset chat history
+    return [], [], qa_chain
+# Handle chat interactions
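+def respond(message, chat_history, qa_chain):
+    # The chain is None until the user clicks "Initialize Chat"
+    if qa_chain is None:
+        raise gr.Error('Click "Initialize Chat" before asking a question.')
+    # The chain's memory supplies chat_history, so only the question is passed
+    result = qa_chain.invoke({'question': message})
+    answer = result.get('answer') or result.get('result')
+    chat_history.append((message, answer))
+    return chat_history, chat_history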
+# Gradio UI layout
+with gr.Blocks() as demo:
+    gr.Markdown('# Video Transcript Chatbot')
+    with gr.Row():
+        video_input = gr.Textbox(label='YouTube Video URL or ID', value='')
+        embedding_model_input = gr.Textbox(
+            label='Embedding Model (default: sentence-transformers/all-MiniLM-L6-v2)',
+            value='sentence-transformers/all-MiniLM-L6-v2'
+        )
+        llm_model_input = gr.Textbox(label='LLM Model Repo (default: meta-llama/Llama-3.1-8B-Instruct)', value='meta-llama/Llama-3.1-8B-Instruct')
+        token_input = gr.Textbox(label='Your HF API Token', placeholder='hf_...', type='password')
+        init_btn = gr.Button('Initialize Chat')
+    chatbot = gr.Chatbot()
+    chat_state = gr.State([])
+    chain_state = gr.State(None)
+    init_btn.click(
+        setup,
+        inputs=[video_input, embedding_model_input, llm_model_input, token_input],
+        outputs=[chatbot, chat_state, chain_state]
+    )
+    txt = gr.Textbox(placeholder='Ask a question about the video...', show_label=False)
+    txt.submit(respond, inputs=[txt, chat_state, chain_state], outputs=[chatbot, chat_state])
+    gr.Button('Clear Chat').click(lambda: ([], []), None, [chatbot, chat_state])
+if __name__ == '__main__':
+    demo.launch()  # pass share=True or server_name/server_port if needed
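
Note on `extract_video_id`: the regex in `app.py` accepts any 11 matching characters that follow `v=` or a slash, so a path segment such as `/attribution_link` would also match, and unrecognized input is returned unchanged. The following is a hedged sketch of a stricter parser built on `urllib.parse`. The name `parse_video_id`, the list of accepted path prefixes, and the choice to return `None` on failure are assumptions for illustration, not part of this commit.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Same character class and length that the regex in app.py assumes for video IDs.
_ID_RE = re.compile(r"[0-9A-Za-z_-]{11}")


def parse_video_id(url_or_id: str) -> Optional[str]:
    """Hypothetical stricter alternative to extract_video_id; returns None if no ID is found."""
    text = url_or_id.strip()
    if _ID_RE.fullmatch(text):  # a raw ID pasted directly
        return text
    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = [p for p in parsed.path.split("/") if p]
            # Assumed path forms: /embed/<id>, /shorts/<id>, /live/<id>
            if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
                candidate = parts[1]
    return candidate if candidate and _ID_RE.fullmatch(candidate) else None


print(parse_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10s"))
```

Returning `None` lets the caller show a clear error instead of sending an invalid ID to the transcript API. URLs without a scheme (for example `youtube.com/watch?v=...`) are not recognized by this sketch.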

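Note on chunking: `load_vector_store` splits the transcript with `chunk_size=1000` and `chunk_overlap=200`. The sketch below shows the basic idea of overlapping windows using plain string slicing. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as newlines and spaces, which this hypothetical `sliding_chunks` helper does not do.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # the last window reached the end of the text
            break
    return chunks


# A 2500-character stand-in for a transcript.
sample = "".join(str(i % 10) for i in range(2500))
pieces = sliding_chunks(sample)
print([len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval. An empty transcript produces no chunks at all, so there is nothing to index in that case.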
requirements.txt ADDED

	@@ -0,0 +1,8 @@

+youtube-transcript-api
+langchain-community
+langchain
+faiss-cpu
+tiktoken
+python-dotenv
+langchain-huggingface
+gradio