metadata
title: Scrape Anythings
emoji: ✨
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.35.0
python_version: '3.9'
app_file: app.py
✨ Scrape Anythings
A user-friendly Streamlit web application for extracting data from any website, including special support for YouTube and Instagram.
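To give a rough sense of what pulling links, images, and text out of a page involves, here is a minimal sketch. It is not the code in scraper.py: it uses only Python's standard-library `html.parser` rather than BeautifulSoup or Selenium, and the `PageExtractor` class name and the example.com URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Collect links, image URLs, and visible text from an HTML string."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.images = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside <script> or <style>.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


html = (
    "<html><head><style>p{}</style></head><body>"
    '<p>Hello <a href="/about">About</a></p><img src="logo.png">'
    "</body></html>"
)
extractor = PageExtractor("https://example.com/")
extractor.feed(html)
print(extractor.links, extractor.images, extractor.text_parts)
```

A parser like this only sees the HTML it is given. Pages that render their content with JavaScript generally need a browser-driven tool such as Selenium.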
🌟 Features
- Scrape Any URL: Paste any website, YouTube, or Instagram URL to start.
- Multiple Data Types: Extract text, images, links, tables, numbers, and metadata.
- Social Media Support: Scrape YouTube video info & comments, and Instagram profile details & posts.
- Rich Data Export: Download your data in JSON, CSV, TXT, and structured Excel (.xlsx) formats.
- Modern UI: A clean and simple interface for a smooth user experience.
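The export formats listed above can be illustrated with a short sketch. It assumes the scraped results are a list of flat dictionaries, which may not match the app's internal structure, and `export_records` is a hypothetical helper name. Excel output is left out because it needs third-party packages.

```python
import csv
import io
import json


def export_records(records):
    """Serialize a list of flat dicts to JSON, CSV, and plain-text strings."""
    # JSON: keep non-ASCII characters readable instead of escaping them.
    as_json = json.dumps(records, indent=2, ensure_ascii=False)

    # CSV: use the union of all keys, in first-seen order, as the header.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    as_csv = buffer.getvalue()

    # TXT: one "key: value" line per field, with a blank line between records.
    as_txt = "\n\n".join(
        "\n".join(f"{key}: {value}" for key, value in record.items())
        for record in records
    )
    return {"json": as_json, "csv": as_csv, "txt": as_txt}


sample = [
    {"type": "link", "value": "https://example.com"},
    {"type": "text", "value": "Hello"},
]
exports = export_records(sample)
print(exports["csv"])
```

In a Streamlit app, each of these strings could be offered to the user through `st.download_button`, passing the string as `data` and a suitable `file_name`.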
🚀 How to Deploy on Hugging Face Spaces
Create a Hugging Face Account: If you don't have one, sign up at huggingface.co.
Create a New Space:
- Go to huggingface.co/new-space.
- Enter a Space name (e.g., scrape-anythings).
- Select Streamlit as the Space SDK.
- Choose Create a new repository for this Space.
- Click Create Space.
Upload Your Files:
- In your new Space, go to the Files tab.
- Click Upload files.
- Drag and drop all the files from your project folder:
app.py, scraper.py, youtube_scraper.py, instagram_scraper.py, instagram_scraper_v2.py, requirements.txt, README.md
- Commit the files directly to the main branch.
Done! Hugging Face will automatically build and launch your application. You can share the URL of your Space with anyone.
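The same steps can also be scripted instead of using the web UI. The sketch below assumes the `huggingface_hub` package is installed and that you have already authenticated (for example with `huggingface-cli login`). `deploy_space` is a hypothetical helper, and `your-username/scrape-anythings` is a placeholder repo id.

```python
def deploy_space(api, repo_id, folder_path):
    """Create a Streamlit Space if it is missing, then upload a local folder.

    `api` is expected to behave like `huggingface_hub.HfApi`.
    """
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="streamlit",
        exist_ok=True,  # do not fail if the Space already exists
    )
    api.upload_folder(
        folder_path=folder_path,
        repo_id=repo_id,
        repo_type="space",
        commit_message="Upload app files",
    )
    return f"https://huggingface.co/spaces/{repo_id}"


# Example usage (needs network access and valid credentials):
#
#     from huggingface_hub import HfApi
#     url = deploy_space(HfApi(), "your-username/scrape-anythings", ".")
#     print(url)
```

Passing the API object in as an argument keeps the function easy to try out with a stub before pointing it at a real account.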
📋 How to Use the App
- Enter a URL: Paste the URL of the website, YouTube video, or Instagram profile you want to scrape.
- Select Data Types: Choose the data you want to extract.
- Click Scrape!: Let the app do the work.
- View & Download: See the results directly in the app and download them in your preferred format.
- Real-time scraping status
- Custom CSS selectors
- Proxy support
- Multi-language support
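For the "Enter a URL" step, the app has to decide whether a URL points at YouTube, Instagram, or a regular website before it picks a scraper. One plausible way to do that is sketched below. The `classify_url` function and its host list are assumptions for illustration, not the logic in app.py.

```python
from urllib.parse import urlparse


def classify_url(url):
    """Return "youtube", "instagram", or "website" for a given URL."""
    # urlparse only finds the hostname when a scheme is present.
    if "://" not in url:
        url = "https://" + url
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in {"youtube.com", "m.youtube.com", "youtu.be"}:
        return "youtube"
    if host == "instagram.com":
        return "instagram"
    return "website"


for candidate in (
    "https://www.youtube.com/watch?v=abc",
    "instagram.com/someprofile",
    "https://example.com/page",
):
    print(candidate, "->", classify_url(candidate))
```

Matching on the parsed hostname rather than a substring avoids misclassifying a URL such as `https://example.com/?ref=youtube.com`.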
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Streamlit team for the amazing web app framework
- BeautifulSoup and Selenium communities
- Hugging Face for hosting capabilities
Made with ❤️ for the AI/ML community