ScrapeRL Comprehensive Test Report
Generated: 2026-04-05 15:51:44
Test Summary
| Test # | Target | Instructions | Format | Status | Steps |
|---|---|---|---|---|---|
| 1 | HackerNews | Top 10 headlines | JSON | ✅ PASS | 19 |
| 2 | Wikipedia | AI article info | JSON | ✅ PASS | 25 |
| 3 | StackOverflow | Top voted questions | JSON | ✅ PASS | 19 |
| 4 | PyPI | NumPy package info | JSON | ✅ PASS | 19 |
| 5 | | Programming posts | JSON | ✅ PASS | 19 |
| 6 | MDN Docs | JavaScript overview | Markdown | ✅ PASS | 25 |
| 7 | DuckDuckGo | ML search results | JSON | ✅ PASS | 19 |
| 8 | GitHub | VSCode repo stats | JSON | ✅ PASS | 19 |
| 9 | NPM | React package details | JSON | ✅ PASS | 19 |
| 10 | Kaggle | Popular datasets | CSV | ✅ PASS | 25 |
Results: 10/10 Tests Passed (100%)
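The headline figure can be recomputed from the table. The sketch below is illustrative only: the `RESULTS` tuple layout and the `summarize` helper are assumptions made for this example, not part of ScrapeRL.

```python
from collections import Counter

# Rows transcribed from the Test Summary table as
# (test number, output format, status, steps).
# This tuple layout is an assumption made for the sketch.
RESULTS = [
    (1, "JSON", "PASS", 19),
    (2, "JSON", "PASS", 25),
    (3, "JSON", "PASS", 19),
    (4, "JSON", "PASS", 19),
    (5, "JSON", "PASS", 19),
    (6, "Markdown", "PASS", 25),
    (7, "JSON", "PASS", 19),
    (8, "JSON", "PASS", 19),
    (9, "JSON", "PASS", 19),
    (10, "CSV", "PASS", 25),
]


def summarize(results):
    """Return pass count, pass rate, per-format counts, and total steps."""
    passed = sum(1 for _, _, status, _ in results if status == "PASS")
    total = len(results)
    return {
        "passed": passed,
        "total": total,
        "pass_rate": 100.0 * passed / total,
        "by_format": dict(Counter(fmt for _, fmt, _, _ in results)),
        "total_steps": sum(steps for *_, steps in results),
    }


summary = summarize(RESULTS)
print(f"{summary['passed']}/{summary['total']} passed ({summary['pass_rate']:.0f}%)")
```

Grouping by format shows that JSON dominates the run (8 of 10 tests), while Markdown and CSV are each exercised once.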
Intelligent Navigation Features Tested
- ✅ GitHub Trending detection and navigation
- ✅ Multi-field extraction (title, content, links, meta, images, data, scripts, forms, tables)
- ✅ CSV output format generation
- ✅ JSON output format generation
- ✅ Markdown output format generation
- ✅ Memory persistence
- ✅ Plugin integration (mcp-browser, mcp-html, skill-extractor, skill-navigator)
- ✅ Sandbox artifact creation
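To make the three output formats concrete, here is a minimal sketch of how one list of extracted records could be rendered as JSON, CSV, or Markdown using only the Python standard library. The function names (`to_json`, `to_csv`, `to_markdown`, `render`) are hypothetical and this is not ScrapeRL's actual implementation; the sample record is taken from the GitHub Trending data in this report.

```python
import csv
import io
import json


def to_json(records):
    """Serialize records as pretty-printed JSON."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize records as CSV; values containing commas are quoted."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_markdown(records):
    """Serialize records as a Markdown pipe table."""
    headers = list(records[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "---|" * len(headers),
    ]
    for record in records:
        lines.append("| " + " | ".join(str(record[h]) for h in headers) + " |")
    return "\n".join(lines)


# Hypothetical dispatch table keyed by the requested output format.
FORMATTERS = {"json": to_json, "csv": to_csv, "markdown": to_markdown}


def render(records, fmt):
    """Render records in the requested format, rejecting unknown formats."""
    key = fmt.lower()
    if key not in FORMATTERS:
        raise ValueError(f"unsupported format: {fmt}")
    return FORMATTERS[key](records)


sample = [
    {"username": "block", "repo_name": "goose", "stars": "35,957", "forks": "3,383"},
]
print(render(sample, "markdown"))
```

A dispatch table keeps the format choice in one place, so adding a fourth format would only need one new function and one new entry.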
GitHub Trending Scraper Test
Requested: "Get me all trending repo" from https://github.com
Result: Successfully navigated to the GitHub trending page and extracted:
- 8 trending repositories with username, repo_name, stars, forks
- CSV output generated and saved to sandbox
Sample Extracted Data (GitHub Trending)
```csv
username,repo_name,stars,forks
Blaizzy,mlx-vlm,"3,749",410
onyx-dot-app,onyx,"24,566","3,294"
Yeachan-Heo,oh-my-codex,"16,124","1,521"
siddharthvaddem,openscreen,"21,264","1,445"
telegramdesktop,tdesktop,"30,915","6,527"
block,goose,"35,957","3,383"
microsoft,agent-framework,"8,838","1,447"
sherlock-project,sherlock,"79,692","9,277"
```
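The star and fork counts in the sample are strings with thousands separators, so they need cleaning before any numeric comparison. The following sketch shows one way to load the sample and convert those fields to integers; `parse_trending` is a hypothetical helper written for this example, not a ScrapeRL function.

```python
import csv
import io

# The sample CSV from this report, reproduced verbatim.
SAMPLE = """\
username,repo_name,stars,forks
Blaizzy,mlx-vlm,"3,749",410
onyx-dot-app,onyx,"24,566","3,294"
Yeachan-Heo,oh-my-codex,"16,124","1,521"
siddharthvaddem,openscreen,"21,264","1,445"
telegramdesktop,tdesktop,"30,915","6,527"
block,goose,"35,957","3,383"
microsoft,agent-framework,"8,838","1,447"
sherlock-project,sherlock,"79,692","9,277"
"""


def parse_trending(text):
    """Parse the CSV text and convert stars/forks to plain integers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "username": row["username"],
            "repo_name": row["repo_name"],
            # Strip thousands separators such as "24,566" before converting.
            "stars": int(row["stars"].replace(",", "")),
            "forks": int(row["forks"].replace(",", "")),
        })
    return rows


repos = parse_trending(SAMPLE)
top = max(repos, key=lambda r: r["stars"])
print(f"{len(repos)} repos; most starred: {top['username']}/{top['repo_name']}")
```

The `csv` module handles the quoted fields, so a value like `"24,566"` arrives as a single cell rather than being split at the comma.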
Configuration
- Backend: FastAPI on port 8000
- Frontend: Vite/React on port 3000
- AI Provider: NVIDIA (llama-3.3-70b)
- Docker: docker-compose.yml
Conclusion
ScrapeRL, an intelligent agentic scraper, passed all 10 tests in this run and demonstrated:
- Intelligent navigation based on user instructions
- GitHub trending repository extraction
- Multi-format output (JSON, CSV, Markdown)
- Plugin system integration
- Memory persistence
- Sandbox artifact management