	---
	title: Scrape Anythings
	emoji: ✨
	colorFrom: blue
	colorTo: green
	sdk: streamlit
	sdk_version: "1.35.0"
	python_version: "3.9"
	app_file: app.py
	---

	# ✨ Scrape Anythings

	A user-friendly Streamlit web application for extracting data from any website, including special support for YouTube and Instagram.

	## 🌟 Features

	- Scrape Any URL: Paste any website, YouTube, or Instagram URL to start.
	- Multiple Data Types: Extract text, images, links, tables, numbers, and metadata.
	- Social Media Support: Scrape YouTube video info & comments, and Instagram profile details & posts.
	- Rich Data Export: Download your data in JSON, CSV, TXT, and structured Excel (.xlsx) formats.
	- Modern UI: A clean and simple interface for a smooth user experience.
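
As a rough illustration of the "Multiple Data Types" feature, the sketch below pulls visible text, links, and image URLs out of an HTML string. It is not the project's `scraper.py`; the names `PageExtractor` and `extract` are hypothetical, and it uses Python's standard-library `html.parser` in place of BeautifulSoup so that it runs without extra dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Collect visible text, link targets, and image sources from HTML.

    Hypothetical helper for illustration; not taken from the project code.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.images = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(html, base_url):
    """Return a dict with the page's text, absolute links, and image URLs."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {
        "text": " ".join(parser.text_parts),
        "links": parser.links,
        "images": parser.images,
    }
```

Calling `extract(html, "https://example.com/page/")` on a downloaded page returns a dictionary whose `links` and `images` entries are absolute URLs. Tables, numbers, and metadata would need additional handlers.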

	## 🚀 How to Deploy on Hugging Face Spaces

	1. Create a Hugging Face Account: If you don't have one, sign up at [huggingface.co](https://huggingface.co/).
	2. Create a New Space:
	   * Go to [huggingface.co/new-space](https://huggingface.co/new-space).
	   * Enter a Space name (e.g., `scrape-anythings`).
	   * Select Streamlit as the Space SDK.
	   * Choose Create a new repository for this Space.
	   * Click Create Space.
	3. Upload Your Files:
	   * In your new Space, go to the Files tab.
	   * Click Upload files.
	   * Drag and drop all the files from your project folder:
	     * `app.py`
	     * `scraper.py`
	     * `youtube_scraper.py`
	     * `instagram_scraper.py`
	     * `instagram_scraper_v2.py`
	     * `requirements.txt`
	     * `README.md`
	   * Commit the files directly to the `main` branch.

	4. Done! Hugging Face will automatically build and launch your application. You can share the URL of your Space with anyone.
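
The upload in step 3 can also be scripted instead of done through the web UI. The sketch below is one possible approach, assuming the `huggingface_hub` package is installed and you are already authenticated (for example through the `HF_TOKEN` environment variable). The helper names `missing_files` and `deploy` are hypothetical, and the repo ID in the usage note is a placeholder.

```python
from pathlib import Path

# File names taken from the upload list in step 3.
PROJECT_FILES = [
    "app.py",
    "scraper.py",
    "youtube_scraper.py",
    "instagram_scraper.py",
    "instagram_scraper_v2.py",
    "requirements.txt",
    "README.md",
]


def missing_files(folder):
    """Return the expected project files that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in PROJECT_FILES if not (folder / name).is_file()]


def deploy(folder, repo_id):
    """Create the Space if it does not exist, then upload the folder to it."""
    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(f"Missing project files: {', '.join(missing)}")

    # Imported here so the file check above works without the package.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up a saved login token or HF_TOKEN
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="streamlit",
        exist_ok=True,
    )
    api.upload_folder(
        folder_path=str(folder),
        repo_id=repo_id,
        repo_type="space",
    )
```

For example, `deploy(".", "your-username/scrape-anythings")` would upload the current directory. The file check runs first so that an incomplete folder fails before any network call is made.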

	## 📋 How to Use the App

	1. Enter a URL: Paste the URL of the website, YouTube video, or Instagram profile you want to scrape.
	2. Select Data Types: Choose the data you want to extract.
	3. Click Scrape!: Let the app do the work.
	4. View & Download: See the results directly in the app and download them in your preferred format.
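
To make steps 1 and 4 more concrete, here is a minimal sketch of how a URL might be routed to the right scraper and how results might be serialized for download. It is an assumption about the general approach, not the app's actual code: `classify_url`, `to_json`, and `to_csv` are hypothetical names, and only the standard library is used.

```python
import csv
import io
import json
from urllib.parse import urlparse


def classify_url(url):
    """Return 'youtube', 'instagram', or 'website' based on the host name."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in ("youtu.be", "youtube.com") or host.endswith(".youtube.com"):
        return "youtube"
    if host == "instagram.com" or host.endswith(".instagram.com"):
        return "instagram"
    return "website"


def to_json(rows):
    """Serialize scraped rows to pretty-printed JSON text."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


def to_csv(rows):
    """Serialize a list of flat dicts to CSV text, one line per dict."""
    if not rows:
        return ""
    # Build the header from every key seen, in first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()
```

In a Streamlit app, strings like these could be handed to `st.download_button` so the user can save the results. Matching on the parsed host name, rather than searching the whole URL for "youtube", avoids misrouting pages that merely mention those sites in their path.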

	## 🛠️ Planned Features

	- [ ] Real-time scraping status
	- [ ] Custom CSS selectors
	- [ ] Proxy support
	- [ ] Multi-language support

	## 🤝 Contributing

	1. Fork the repository
	2. Create a feature branch
	3. Make your changes
	4. Test thoroughly
	5. Submit a pull request

	## 📄 License

	This project is licensed under the MIT License - see the LICENSE file for details.

	## 🙏 Acknowledgments

	- Streamlit team for the amazing web app framework
	- BeautifulSoup and Selenium communities
	- Hugging Face for hosting capabilities

	---

	Made with ❤️ for the AI/ML community