---
title: UniversalScrap
emoji: 👀
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
---

# 🚀 Universal Web Scraper

A powerful web scraping tool that can handle **ANY** website, including JavaScript-heavy single-page applications (SPAs). Built with Playwright and designed for Hugging Face Spaces.

## ✨ Features

- **🎯 Universal Compatibility**: Scrapes static HTML and JavaScript-rendered content
- **🔄 Recursive Crawling**: Automatically follows and scrapes all internal links
- **📊 Smart Content Extraction**: Converts HTML to clean, readable text
- **💾 Multiple Export Options**: Individual TXT files + ZIP download
- **🛡️ Failure-Resistant**: Multiple fallback methods ensure success
- **⚡ Optimized Performance**: Rate limiting and timeout handling

## 🚀 Perfect For

- Documentation websites
- E-commerce sites
- News portals
- Blogs and content sites
- Single-page applications (React, Vue, Angular)
- Any website with dynamic content

## 🛠️ How It Works

1. **Primary Method**: Uses Playwright to handle JavaScript-heavy sites
2. **Fallback Method**: Uses aiohttp for static content if Playwright fails
3. **Content Processing**: Extracts clean text and all internal links
4. **Recursive Discovery**: Follows links up to specified depth
5. **File Generation**: Creates individual TXT files for each page

Built with ❤️ for the web scraping community.
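
The steps listed under "How It Works" can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the names `extract`, `fetch_with_fallback`, and `crawl` are hypothetical, the sketch is synchronous and uses only the standard library, and the fetchers are passed in as plain callables. In the real app the first fetcher would presumably wrap Playwright and the second aiohttp. File generation (step 5) is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _PageParser(HTMLParser):
    """Collects visible text and raw href values from one HTML document."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.hrefs = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(html, page_url):
    """Step 3: return (clean_text, internal_links) for one page."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    site = urlparse(page_url).netloc
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop #fragments so pages are not revisited.
        absolute = urldefrag(urljoin(page_url, href)).url
        parsed = urlparse(absolute)
        is_internal = parsed.scheme in ("http", "https") and parsed.netloc == site
        if is_internal and absolute not in links:
            links.append(absolute)
    return "\n".join(parser.text_parts), links


def fetch_with_fallback(url, fetchers):
    """Steps 1-2: try each fetcher in order; return the first HTML, else None."""
    for fetch in fetchers:
        try:
            html = fetch(url)
        except Exception:
            continue  # this method failed, fall through to the next one
        if html:
            return html
    return None


def crawl(start_url, fetchers, max_depth=1):
    """Step 4: breadth-first crawl of internal links up to max_depth."""
    pages = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch_with_fallback(url, fetchers)
        if html is None:
            continue
        text, links = extract(html, url)
        pages[url] = text
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return pages
```

Calling `crawl(start_url, [render_with_browser, fetch_static], max_depth=2)` with two fetcher functions of your own (both names are placeholders) returns a dict mapping each visited URL to its extracted text, which could then be written out as one TXT file per page. Passing the fetchers in as arguments keeps the crawl logic independent of any particular browser or HTTP library.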