---
title: Crawl4AI Web Content Extractor
emoji: 🕷️
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---

# Crawl4AI Demo - Docker Deployment

This is a Docker-ready version of the Crawl4AI demo application, specifically designed for deployment on Hugging Face Spaces.

## Features

- Web interface built with Gradio
- Support for multiple crawler types (Basic, LLM, Cosine, JSON/CSS)
- Configurable word count threshold
- Markdown output with metadata
- Sub-page crawling capabilities
- Lazy loading support
- Docker-optimized configuration

## Deployment Instructions

1. Create a new Space on Hugging Face:
   - Go to huggingface.co/spaces
   - Click "Create new Space"
   - Choose "Docker" as the SDK
   - Set the hardware requirements (recommended: CPU + 16GB RAM)
2. Upload the files:
   - Upload all files from this directory to your Space
   - Make sure to include:
     - `Dockerfile`
     - `app.py`
     - `requirements.txt`
     - `README.md`
3. The Space will automatically build and deploy the application.

## Environment Variables

No environment variables are required for basic functionality. The application is configured to run out of the box.

## Hardware Requirements

- CPU: 2+ cores recommended
- RAM: 16GB recommended
- Disk: 5GB minimum

## Browser Support

The application uses Chrome in headless mode for web crawling. The Dockerfile includes all necessary dependencies.

## Limitations

- Memory usage increases with the number of pages crawled
- Some websites may block automated crawling
- JavaScript-heavy sites may require additional configuration

## Troubleshooting

If you encounter issues:

1. Check the Space logs for error messages
2. Ensure the Chrome browser is running correctly
3. Verify network connectivity
4. Check memory usage

## Development

To run locally with Docker:

```bash
docker build -t crawl4ai-demo .
docker run -p 7860:7860 crawl4ai-demo
```

Visit http://localhost:7860 to access the application.
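For reference, a `Dockerfile` for a Space like this typically follows the shape sketched below. This is a hypothetical outline, not the actual `Dockerfile` shipped in this repository: the base image, the `chromium` package name, and the `CMD` are assumptions and may differ from the real file.

```dockerfile
# Hypothetical sketch -- the real Dockerfile in this repo may differ.
FROM python:3.11-slim

# Install a headless-capable browser and clean up apt caches to keep
# the image small (assumes Debian's chromium package suffices).
RUN apt-get update && apt-get install -y --no-install-recommends \
    chromium \
 && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install Python dependencies first so this layer is cached across
# code-only changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Hugging Face Spaces routes traffic to port 7860.
EXPOSE 7860
CMD ["python", "app.py"]
```

Copying `requirements.txt` before the rest of the source is a common layer-caching pattern: dependency installation is only re-run when the requirements change, not on every code edit.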
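The configurable word count threshold listed under Features can be illustrated with a minimal, self-contained filter. This is a hypothetical sketch of the general technique (dropping short text blocks such as navigation links), not the actual Crawl4AI implementation; the function name and signature are invented for illustration.

```python
def filter_blocks(blocks, word_count_threshold=10):
    """Keep only text blocks containing at least `word_count_threshold` words.

    Hypothetical illustration of a word-count threshold: short fragments
    (menus, button labels, breadcrumbs) fall below the cutoff, while
    substantive paragraphs are kept.
    """
    return [b for b in blocks if len(b.split()) >= word_count_threshold]


if __name__ == "__main__":
    blocks = [
        "Home | About | Contact",  # navigation residue: 5 words, dropped at threshold 10
        "Crawl4AI extracts the main article content from a page and "
        "converts it into clean Markdown suitable for LLM pipelines.",
    ]
    for kept in filter_blocks(blocks, word_count_threshold=10):
        print(kept)
```

Raising the threshold trades recall for cleanliness: a higher cutoff removes more boilerplate but risks dropping short, legitimate paragraphs.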
## License

This project is licensed under the MIT License - see the LICENSE file for details.