muddasser committed
Commit 6b33a86 · verified · 1 Parent(s): 03ce648

Update README.md

Files changed (1)
  1. README.md +30 -23
README.md CHANGED
@@ -1,41 +1,48 @@
  ---
- title: WebScraping with Selenium + RAG
  emoji: 🕷️
  colorFrom: red
  colorTo: red
- sdk: streamlit
  app_file: app.py
  app_port: 8501
  tags:
- - streamlit
- - selenium
- - rag
- - flan-t5
- - web-scraping
  pinned: true
- short_description: A Streamlit RAG Selenium + FLAN-T5-small.
  ---


- # 🕷️ Web Scraping + RAG Chatbot 🚀

- This project combines **Selenium web scraping** with **Retrieval-Augmented Generation (RAG)** to build an intelligent chatbot.
- It can scrape live websites, index the content into a **FAISS vector store**, and let you ask natural language questions.

- ### 🔹 Features
- - 🌐 Scrape dynamic websites using **Selenium**
- - 📚 Store & retrieve content with **FAISS embeddings**
- - 🧠 Answer questions using **FLAN-T5-small** (runs on CPU)
- - 🎨 Simple **Streamlit UI** for interaction

- ### 🚀 Usage
- 1. Enter a URL to scrape.
- 2. The system extracts + indexes the text.
- 3. Ask questions - the chatbot answers using RAG.

- ---

- 👩‍💻 **Tech Stack**: `Streamlit`, `Selenium`, `LangChain`, `FAISS`, `Transformers`

- 📖 Check the [docs](https://docs.streamlit.io) for customizing your Streamlit app.

  ---
+ title: Web Scraping with Selenium + RAG
  emoji: 🕷️
  colorFrom: red
  colorTo: red
+ sdk: docker
  app_file: app.py
  app_port: 8501
  tags:
+ - streamlit
+ - selenium
+ - rag
+ - flan-t5
+ - web-scraping
  pinned: true
+ short_description: Selenium RAG using FLAN-T5-small
  ---

+ # 🕷️ Web Scraping + RAG Chatbot

+ This project combines **Selenium web scraping** with **Retrieval-Augmented Generation (RAG)** to build an intelligent chatbot that can extract information from websites and answer questions about the content.

+ ![Demo](https://img.shields.io/badge/Demo-Live%20Demo-blue)
+ ![Python](https://img.shields.io/badge/Python-3.10%2B-blue)
+ ![License](https://img.shields.io/badge/License-MIT-green)

+ ## ✨ Features

+ - 🌐 **Web Scraping**: Extract content from dynamic websites using Selenium
+ - 📚 **Vector Storage**: Index and retrieve content using FAISS embeddings
+ - 🧠 **Question Answering**: Generate answers using FLAN-T5-small model
+ - 🎨 **User-Friendly Interface**: Simple Streamlit UI for interaction
+ - 🐳 **Dockerized**: Ready for deployment on Hugging Face Spaces
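
Reviewer note: the feature list above describes a scrape, index, retrieve, answer pipeline. As a rough illustration of the retrieval step only, the sketch below ranks text chunks against a question using bag-of-words cosine similarity. It is a stand-in, not code from this commit or from `app.py`: the README names FAISS embeddings and FLAN-T5-small, whereas `chunk_text`, `embed`, `cosine`, and `retrieve` here are hypothetical helpers built on the standard library so the example runs without extra dependencies.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split scraped text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (stand-in for a FAISS lookup)."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical scraped text, already split into chunks.
    chunks = [
        "Selenium drives a real browser to load dynamic pages.",
        "FAISS stores vectors for fast similarity search.",
        "Streamlit renders the user interface.",
    ]
    for chunk in retrieve("How does FAISS search vectors?", chunks, k=1):
        print(chunk)
```

In a real deployment the retrieved chunks would be placed into the prompt for the answering model; that step is omitted here because it depends on the model and libraries the app actually uses.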

+ ## 🚀 Quick Start
+
+ ### Prerequisites

+ - Python 3.10+
+ - Docker (for containerized deployment)
+ - Hugging Face account (for deployment)

+ ### Local Installation

+ 1. Clone the repository:
+ ```bash
+ git clone https://huggingface.co/spaces/your-username/your-space-name
+ cd your-space-name