File size: 2,395 Bytes

e146eba

# Customizable Web Scraper  

## Overview  
The **Customizable Web Scraper** is a lightweight Python tool that allows users to extract specific elements from any webpage using a simple graphical interface. Built with **Streamlit**, **BeautifulSoup**, and **Pandas**, this tool enables users to analyze HTML structure, select relevant tags, and download the extracted data in CSV format.  

## Features  
✅ **User-friendly Streamlit interface**  
🔍 **Automatic detection of available HTML tags**  
📌 **Custom tag selection** (`h1`, `h2`, `p`, `a`, `img`, `ul`, etc.)  
📊 **Displays scraped data in a structured table**  
📥 **Download extracted data as a CSV file**  

## Installation  

### Prerequisites  
Ensure you have **Python 3.x** installed on your system.  

### Steps  
1. Clone this repository or download the script:  
   ```sh

   git clone https://github.com/your-repository/Customizable-Scraper.git

   cd Customizable-Scraper

   ```  
2. Install the required dependencies:  
   ```sh

   pip install streamlit requests beautifulsoup4 pandas

   ```  
3. Run the Streamlit app:  
   ```sh

   streamlit run app.py

   ```  

## Usage  

1. **Enter a URL**: Provide the webpage link you want to scrape.  
2. **Analyze the page**: The scraper will identify available HTML tags.  
3. **Select tags**: Choose which elements (headings, paragraphs, links, images, lists, etc.) to extract.  
4. **Scrape Data**: Click the **"Scrape Data"** button to fetch and display the extracted content.  
5. **Download CSV**: Export the scraped data as a CSV file for offline use.  

## Technologies Used  
- **Streamlit** – Interactive UI for user-friendly operation  
- **Requests** – Fetching webpage content  
- **BeautifulSoup4** – Parsing and extracting HTML elements  
- **Pandas** – Structuring and exporting scraped data  

## Limitations  
⚠️ This scraper **cannot**:  
- Extract data from **JavaScript-rendered content**  
- Access **login-restricted** or **protected** pages  
- Scrape sites that block requests in **robots.txt**  

## License  
This project is **open-source** and available for personal and educational use.  

## Contributions  
🔹 Contributions are welcome!  
If you’d like to improve this project, feel free to fork the repository, make enhancements, and submit a **Pull Request**.