---
title: Scrape Anythings
emoji:
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: "1.35.0"
python_version: "3.9"
app_file: app.py
---
# ✨ Scrape Anythings
A user-friendly Streamlit web application for extracting data from any website, including special support for YouTube and Instagram.
## 🌟 Features
- **Scrape Any URL**: Paste any website, YouTube, or Instagram URL to start.
- **Multiple Data Types**: Extract text, images, links, tables, numbers, and metadata.
- **Social Media Support**: Scrape YouTube video info & comments, and Instagram profile details & posts.
- **Rich Data Export**: Download your data in JSON, CSV, TXT, and structured Excel (.xlsx) formats.
- **Modern UI**: A clean and simple interface for a smooth user experience.
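To make the "Multiple Data Types" feature more concrete, here is a minimal sketch of pulling text, links, and images out of a page. It is not the app's actual code: it uses only Python's standard-library `html.parser` as a stand-in for whatever `scraper.py` does internally, and the names `PageExtractor` and `extract` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Collects visible text, link targets, and image sources from HTML."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.text, self.links, self.images = [], [], []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def extract(html, base_url=""):
    """Return the text, links, and images found in an HTML string."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {"text": parser.text, "links": parser.links, "images": parser.images}
```

Calling `extract(html, "https://example.com/")` on a fetched page returns a dictionary with one list per data type. Tables, numbers, and metadata would need additional handlers and are left out of this sketch.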
## 🚀 How to Deploy on Hugging Face Spaces
1. **Create a Hugging Face Account**: If you don't have one, sign up at [huggingface.co](https://huggingface.co/).
2. **Create a New Space**:
    * Go to [huggingface.co/new-space](https://huggingface.co/new-space).
    * Enter a **Space name** (e.g., `scrape-anythings`).
    * Select **Streamlit** as the Space SDK.
    * Choose **Create a new repository for this Space**.
    * Click **Create Space**.
3. **Upload Your Files**:
    * In your new Space, go to the **Files** tab.
    * Click **Upload files**.
    * Drag and drop all the files from your project folder:
        * `app.py`
        * `scraper.py`
        * `youtube_scraper.py`
        * `instagram_scraper.py`
        * `instagram_scraper_v2.py`
        * `requirements.txt`
        * `README.md`
    * Commit the files directly to the `main` branch.
4. **Done!** Hugging Face will automatically build and launch your application. You can share the URL of your Space with anyone.
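If you prefer a script to the web uploader, the upload step can also be done with the `huggingface_hub` Python package. The sketch below assumes the Space already exists, that `huggingface_hub` is installed, and that you are logged in with a write token; the helper names `missing_files` and `upload_project` and the example `repo_id` are hypothetical.

```python
from pathlib import Path

# File names taken from the upload step above.
EXPECTED_FILES = [
    "app.py",
    "scraper.py",
    "youtube_scraper.py",
    "instagram_scraper.py",
    "instagram_scraper_v2.py",
    "requirements.txt",
    "README.md",
]


def missing_files(folder):
    """Return the expected files that are not present in `folder`."""
    root = Path(folder)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def upload_project(folder, repo_id):
    """Upload the project files to an existing Space (needs a write token)."""
    # Imported lazily so the pre-flight check works without the package.
    from huggingface_hub import HfApi

    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(f"Missing files: {', '.join(missing)}")

    HfApi().upload_folder(
        folder_path=str(folder),
        repo_id=repo_id,  # placeholder, e.g. "your-username/scrape-anythings"
        repo_type="space",
        allow_patterns=EXPECTED_FILES,  # upload only the listed files
        commit_message="Upload project files",
    )
```

Running `upload_project(".", "your-username/scrape-anythings")` from the project folder should produce a single commit on the Space, after which the build starts as described in step 4.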
## 📋 How to Use the App
1. **Enter a URL**: Paste the URL of the website, YouTube video, or Instagram profile you want to scrape.
2. **Select Data Types**: Choose the data you want to extract.
3. **Click Scrape!**: Let the app do the work.
4. **View & Download**: See the results directly in the app and download them in your preferred format.
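As a rough illustration of the download step, the sketch below serializes scraped records to JSON, CSV, and plain text using only the standard library. The record shape (a list of flat dictionaries) and the function names are assumptions made for this example, not the app's actual internals. Excel export is omitted because it needs a third-party library.

```python
import csv
import io
import json


def to_json(records):
    """Serialize records to pretty-printed JSON, keeping non-ASCII text."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records):
    """Serialize a list of flat dicts to CSV text with a header row."""
    if not records:
        return ""
    # Build the column list from every key seen, in first-seen order.
    fieldnames = []
    for row in records:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


def to_txt(records):
    """Serialize records to one 'key: value, ...' line per record."""
    return "\n".join(
        ", ".join(f"{key}: {value}" for key, value in row.items())
        for row in records
    )
```

In a Streamlit app, each of these strings could be passed as the `data` argument of `st.download_button`, with a matching `file_name` and `mime` type.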
## 🔮 Planned Features
- [ ] Real-time scraping status
- [ ] Custom CSS selectors
- [ ] Proxy support
- [ ] Multi-language support
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments
- Streamlit team for the amazing web app framework
- BeautifulSoup and Selenium communities
- Hugging Face for hosting capabilities
---
**Made with ❤️ for the AI/ML community**